Posted to user@cassandra.apache.org by shiv shivaji <sh...@yahoo.com> on 2010/03/03 02:26:13 UTC

Re: Anti-compaction Diskspace issue even when latest patch applied

Thanks, just realized this after looking at the source code.

Seems like decommission will not work for me due to disk space issues. I am currently moving all the data on the heavy node (5 TB full) to a 12 TB disk drive. I am planning to remove the old token and reassign a new token to this node.

The docs say to use decommission; however, the lack of disk space does not allow me to do that. If I manually move all the data files and then do a removetoken and start the node with a new token, would that work (as was implied in a JIRA)?
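
To make the plan concrete, the steps I have in mind are roughly these (a rough sketch only; the mount points, host and token values are placeholders for our setup, and the exact nodetool flags may differ by version):

    # on the heavy node, with Cassandra stopped
    rsync -a /mnt/old_5tb/cassandra/data/ /mnt/new_12tb/cassandra/data/
    # point the data directories in storage-conf.xml at the new disk and set a new InitialToken
    # then drop the old token from the ring (run against a live node) and start this node back up
    nodetool -host <live-node> removetoken <old-token>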

Shiv





________________________________
From: Stu Hood <st...@rackspace.com>
To: cassandra-user@incubator.apache.org
Sent: Sun, February 28, 2010 1:53:29 PM
Subject: Re: Anti-compaction Diskspace issue even when latest patch applied

`nodetool cleanup` is a very expensive process: it performs a major compaction, and should not be done that frequently.

-----Original Message-----
From: "shiv shivaji" <sh...@yahoo.com>
Sent: Sunday, February 28, 2010 3:34pm
To: cassandra-user@incubator.apache.org
Subject: Re: Anti-compaction Diskspace issue even when latest patch applied

Seems like the temporary solution was to run a cron job that calls nodetool cleanup every 5 minutes or so. This kept the free disk space from dropping too low.
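
For reference, the cron entry was along these lines (a sketch from memory; the nodetool path and flags are specific to our setup and version):

    */5 * * * * /opt/cassandra/bin/nodetool -host localhost cleanup >> /var/log/cassandra-cleanup.log 2>&1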

The manual solution you mentioned is worth considering, as the load balancing is taking a while.

I will track the JIRA issue on anti-compaction and disk space. Thanks for the pointer.


Thanks, Shiv




________________________________
From: Jonathan Ellis <jb...@gmail.com>
To: cassandra-user@incubator.apache.org
Sent: Wed, February 24, 2010 11:34:59 AM
Subject: Re: Anti-compaction Diskspace issue even when latest patch applied

as you noticed, "nodeprobe move" first unloads the data, then moves to
the new position.  so that won't help you here.

If you are using replicationfactor=1, scp the data to the previous
node on the ring, then reduce the original node's token so it isn't
responsible for so much, and run cleanup.  (you can do this w/ higher
RF too, you just have to scp the data more places.)
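
In other words, something along these lines (hostnames, keyspace and data paths are placeholders, and exact nodetool flags vary by version; adjust to your layout):

    # copy the overloaded node's sstables to the previous node on the ring
    scp /var/lib/cassandra/data/<Keyspace>/*.db prev-node:/var/lib/cassandra/data/<Keyspace>/
    # lower the overloaded node's token so it owns a smaller range, restart it,
    # then remove data that no longer belongs on each node
    nodetool -host overloaded-node cleanup
    nodetool -host prev-node cleanup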

Finally, you could work on
https://issues.apache.org/jira/browse/CASSANDRA-579 so it doesn't need
to anticompact to disk before moving data.

-Jonathan

On Wed, Feb 24, 2010 at 12:06 PM, shiv shivaji <sh...@yahoo.com> wrote:
> According to the stack trace I get in the log, it makes it look like the
> patch was for anti-compaction but I did not look at the source code in
> detail yet.
>
> java.util.concurrent.ExecutionException:
> java.lang.UnsupportedOperationException: disk full
>         at
> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>         at
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:86)
>         at
> org.apache.cassandra.db.CompactionManager$CompactionExecutor.afterExecute(CompactionManager.java:570)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.UnsupportedOperationException: disk full
>         at
> org.apache.cassandra.db.CompactionManager.doAntiCompaction(CompactionManager.java:344)
>         at
> org.apache.cassandra.db.CompactionManager.doCleanupCompaction(CompactionManager.java:405)
>         at
> org.apache.cassandra.db.CompactionManager.access$400(CompactionManager.java:49)
>         at
> org.apache.cassandra.db.CompactionManager$2.call(CompactionManager.java:130)
>         at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         ... 2 more
>
> I tried "nodetool cleanup" before and it did not really stop the disk from
> filling, is there a way to force move the data or some other way to solve
> the issue?
>
> Thanks, Shiv
>
> ________________________________
> From: Jonathan Ellis <jb...@gmail.com>
> To: cassandra-user@incubator.apache.org
> Sent: Wed, February 24, 2010 7:16:32 AM
> Subject: Re: Anti-compaction Diskspace issue even when latest patch applied
>
> The patch you refer to was to help *compaction*, not *anticompaction*.
>
> If the space is mostly hints for other machines (is that what you
> meant by "due to past problems with others?") you should run nodeprobe
> cleanup on it to remove data that doesn't actually belong on that
> node.
>
> -Jonathan
>
> On Wed, Feb 24, 2010 at 3:09 AM, shiv shivaji <sh...@yahoo.com> wrote:
>> For about 6 TB of total data size with a replication factor of 2 (6 TB x 2)
>> on a five-node cluster, I see about 4.6 TB on one machine (due to potential
>> past problems with other machines). The machine has a 6 TB disk.
>>
>> The data folder on this machine has 59,289 files totaling 4.6 TB. The files
>> are the data, filter and index files. I see that anti-compaction is running. I
>> applied a recent patch which does not do anti-compaction if disk space is
>> limited, but I still see it happening. I have also called nodetool loadbalance
>> on this machine. It seems like it will run out of disk space anyway.
>>
>> The disk space consumed per machine is as follows (each machine has a 6 TB
>> hard drive on RAID):
>>
>> Machine Space Consumed
>> M1    4.47 TB
>> M2    2.93 TB
>> M3    1.83 GB
>> M4    56.19 GB
>> M5    398.01 GB
>>
>> How can I force M1 to immediately move its load to M3 and M4, for instance
>> (or any other machine)? The nodetool move command moves all data; is there
>> a way instead to force-move 50% of the data to M3 and the remaining 50% to
>> M4, and resume anti-compaction after the move?
>>
>> Thanks, Shiv
>>
>>
>>
>

Re: Anti-compaction Diskspace issue even when latest patch applied

Posted by shiv shivaji <sh...@yahoo.com>.
Ah, will look at the jmx console. Thought it was under nodetool.

content@cl201 ~/swell/cassandra $ iostat -x
Linux 2.6.30-gentoo-r4pb (cl201)     03/05/10     _x86_64_    (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           9.66    0.00    2.18    4.98    0.00   83.18

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda              86.37   315.56   53.65   73.00  7534.09  9905.44   137.70     1.47   11.63   2.18  27.66
sdb              24.35   254.96   29.45   67.79  6842.55  9415.54   167.19     0.73    7.50   1.83  17.83
sdc              24.41   254.17   29.44   67.63  6844.34  9389.56   167.23     0.73    7.50   1.83  17.78
sde              24.73   254.26   29.44   67.87  6844.29  9392.18   166.86     0.63    6.50   1.81  17.63
sdd              24.38   254.82   29.44   69.24  6842.92  9417.35   164.76     0.75    7.58   1.83  18.01
sdf              24.35   254.57   29.36   67.37  6842.16  9397.45   167.88     0.69    7.18   1.83  17.68
md0               0.00     0.00    0.60    1.31    18.62    27.21    24.00     0.00    0.00   0.00   0.00
md2               0.00     0.00  322.69 1932.98 41043.23 56337.83    43.17     0.00    0.00   0.00   0.00

Shiv




________________________________
From: Jonathan Ellis <jb...@gmail.com>
To: cassandra-user@incubator.apache.org
Sent: Fri, March 5, 2010 11:52:18 AM
Subject: Re: Anti-compaction Diskspace issue even when latest patch applied

On Fri, Mar 5, 2010 at 1:36 PM, shiv shivaji <sh...@yahoo.com> wrote:
> Sorry, how to get compaction progress with 0.6. Is it in nodetool or
> somewhere else? I tried a few options after nodetool and did not get this
> info.

it's under CompactionManager in jmx.  I'm not sure if nodetool exposes
this but it's easy to find in JConsole mbeans.

> iostats:

what about iostat -x?

-Jonathan

Re: Anti-compaction Diskspace issue even when latest patch applied

Posted by Jonathan Ellis <jb...@gmail.com>.
On Fri, Mar 5, 2010 at 1:36 PM, shiv shivaji <sh...@yahoo.com> wrote:
> Sorry, how to get compaction progress with 0.6. Is it in nodetool or
> somewhere else? I tried a few options after nodetool and did not get this
> info.

it's under CompactionManager in jmx.  I'm not sure if nodetool exposes
this but it's easy to find in JConsole mbeans.
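
A quick way to look, if it helps (assuming the default JMX port from the 0.6-era startup scripts, 8080; adjust to your config, and the exact attribute names vary a bit between versions):

    jconsole <node-host>:8080
    # MBeans tab -> org.apache.cassandra.db -> CompactionManager
    # the attributes there show the column family being compacted and bytes compacted vs. total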

> iostats:

what about iostat -x?

-Jonathan

Re: Anti-compaction Diskspace issue even when latest patch applied

Posted by shiv shivaji <sh...@yahoo.com>.
Sorry, how do I get compaction progress in 0.6? Is it in nodetool or somewhere else? I tried a few nodetool options and did not find this info.

My vmstat output is:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0 2234752  85440      0 27270300   43   32  2562  3474   19   21 10  2 83  5
 1  0 2234744  91220      0 27281208   10    0 24893 41788 10330 2482 10  2 81  6
 2  0 2234732 102560      0 27271640   39    0 25230 21048 10300 2346 10  2 82  6
 1  1 2234720 106660      0 27268192    0    0 24730 34483 10700 2563 10  3 81  6


iostats:
Linux 2.6.30-gentoo-r4pb (cl201)     03/05/10     _x86_64_    (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           9.62    0.00    2.17    4.95    0.00   83.26

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             126.60      7527.18      9843.04 1252414800 1637741619
sdb              96.95      6828.70      9348.09 1136197889 1555388186
sdc              96.82      6830.49      9324.74 1136496543 1551504064
sde              97.04      6830.45      9327.39 1136488926 1551944491
sdd              98.37      6829.08      9349.83 1136260775 1555678239
sdf              96.46      6828.30      9330.57 1136131741 1552473459
md0               1.94        18.59        27.97    3092790    4653501
md2            2239.84     40960.19     55939.76 6815190175 9307575976

The md2 disk contains the data for cassandra.

Regarding my previous reply, I do not mind working on the issue you mentioned, but I have to get manager approval; if it best solves the problem, then great. So far, I am convinced the slowness is related to compaction.

Shiv




________________________________
From: Jonathan Ellis <jb...@gmail.com>
To: cassandra-user@incubator.apache.org
Sent: Fri, March 5, 2010 9:00:19 AM
Subject: Re: Anti-compaction Diskspace issue even when latest patch applied

On Fri, Mar 5, 2010 at 2:13 AM, shiv shivaji <sh...@yahoo.com> wrote:
> 1. Is there a way to estimate the time it would take to compact this work
> load? I hope the load balancing will be much faster after the compaction.
> Curious how fast I can get the transfer once compaction is done.

0.6 gives you compaction progress, so you can estimate from that.

> 2. Any way to make this faster? Is working on the above issue the lowest
> hanging fruit or is there something else?

Not adding new data at the same time would probably make it faster,
although you haven't told us where the bottleneck is.
(http://spyced.blogspot.com/2010/01/linux-performance-basics.html)

-Jonathan

Re: Anti-compaction Diskspace issue even when latest patch applied

Posted by Jonathan Ellis <jb...@gmail.com>.
On Fri, Mar 5, 2010 at 2:13 AM, shiv shivaji <sh...@yahoo.com> wrote:
> 1. Is there a way to estimate the time it would take to compact this work
> load? I hope the load balancing will be much faster after the compaction.
> Curious how fast I can get the transfer once compaction is done.

0.6 gives you compaction progress, so you can estimate from that.

> 2. Any way to make this faster? Is working on the above issue the lowest
> hanging fruit or is there something else?

Not adding new data at the same time would probably make it faster,
although you haven't told us where the bottleneck is.
(http://spyced.blogspot.com/2010/01/linux-performance-basics.html)

-Jonathan

Re: Anti-compaction Diskspace issue even when latest patch applied

Posted by shiv shivaji <sh...@yahoo.com>.
Thanks for the pointer. I wanted to figure out whether this is the real bottleneck, as there might be something else contributing to the low speed.
Let me explain our setup in more detail:

We are using Cassandra to store about 700 million images. This includes the image metadata and the image itself (in binary format).
In the data folder I see about 53,831 files. There are also about 3,940 files with names like
imagestore-b-66171-Data.db
imagestore-b-66171-Filter.db

I assume the latter are the compacted files. This set of files is indeed growing. 

The problem is that one node in our cluster is 4.5 TB full and we are trying to load balance away from that node.
I moved this node to a larger hard disk and am trying out the load balancing, but it is really slow.

When I do the load balancing, row mutations are taking place and compaction is still continuing. The load balancing is happening at the rate of 10 MB per half hour. I
suspect this is due to the compaction (and the anti-compaction).
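
To put that rate in perspective (rough arithmetic, assuming the observed rate holds):

    4.5 TB ~= 4.5 * 1024 * 1024 MB ~= 4.7 million MB
    10 MB per half hour = 20 MB/hour
    4.7 million MB / 20 MB per hour ~= 235,000 hours, i.e. on the order of decades

so the current rate is clearly not workable.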

Now for questions:
1. Is there a way to estimate the time it would take to compact this workload? I hope the load balancing will be much faster after the compaction. I am curious how fast I can get the transfer once compaction is done.
2. Any way to make this faster? Is working on the above issue the lowest-hanging fruit, or is there something else?

Thanks, Shiv






________________________________
From: Jonathan Ellis <jb...@gmail.com>
To: cassandra-user@incubator.apache.org
Sent: Thu, March 4, 2010 8:31:16 AM
Subject: Re: Anti-compaction Diskspace issue even when latest patch applied

https://issues.apache.org/jira/browse/CASSANDRA-579 should make a big
difference in speed.  If you want to take a stab at it I can point you
in the right direction. :)

Re: Anti-compaction Diskspace issue even when latest patch applied

Posted by Jonathan Ellis <jb...@gmail.com>.
https://issues.apache.org/jira/browse/CASSANDRA-579 should make a big
difference in speed.  If you want to take a stab at it I can point you
in the right direction. :)

On Thu, Mar 4, 2010 at 10:24 AM, shiv shivaji <sh...@yahoo.com> wrote:
> Yes.
>
> The IP change trick seems to work. Load balancing seems a little slow, but I
> will open a new thread on that if needed.
>
> Thanks, Shiv

Re: Anti-compaction Diskspace issue even when latest patch applied

Posted by shiv shivaji <sh...@yahoo.com>.
Yes. 

The IP change trick seems to work. Load balancing seems a little slow, but I will open a new thread on that if needed.

Thanks, Shiv





________________________________
From: Jonathan Ellis <jb...@gmail.com>
To: cassandra-user@incubator.apache.org
Sent: Wed, March 3, 2010 9:21:28 AM
Subject: Re: Anti-compaction Diskspace issue even when latest patch applied

You are proposing manually moving your data from a 5 TB disk to a 12 TB
disk, and that is the only change you want to make?  Then just keep
the IP the same when you restart it after moving, and you won't have
to do anything else; it will just look like the node was down
temporarily and is now back up.

Re: Anti-compaction Diskspace issue even when latest patch applied

Posted by Jonathan Ellis <jb...@gmail.com>.
You are proposing manually moving your data from a 5 TB disk to a 12 TB
disk, and that is the only change you want to make?  Then just keep
the IP the same when you restart it after moving, and you won't have
to do anything else; it will just look like the node was down
temporarily and is now back up.
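
Roughly (a sketch; the mount points are placeholders, and the element names are the storage-conf.xml ones from 0.6):

    # with Cassandra stopped on the node
    rsync -a /mnt/5tb_disk/cassandra/ /mnt/12tb_disk/cassandra/
    # update DataFileDirectories / CommitLogDirectory in storage-conf.xml to the new
    # paths (or mount the new disk at the old path), keep the same listen address
    # and token, then start Cassandra again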

On Tue, Mar 2, 2010 at 7:26 PM, shiv shivaji <sh...@yahoo.com> wrote:
> Thanks, just realized this after looking at the source code.
>
> Seems like the decommission will not work for me due to disk space issues. I
> am currently moving all the data on the heavy node (5 TB full) to a 12 TB
> disk drive. I am planning to remove the old token and reassign a new token
> to this node.
>
> According to the docs, it says to use decommission, however lack of disk
> space does not allow me to do this. If I manually move all the data files
> and then do a removetoken and start the node with a new token, would that
> work (as was implied in a JIRA)?
>
> Shiv