Posted to user@cassandra.apache.org by Michael Kjellman <mk...@barracuda.com> on 2012/09/17 21:39:56 UTC

persistent compaction issue (1.1.4 and 1.1.5)

Hi All,

I have an issue where each one of my nodes (currently all running 1.1.5) is reporting around 30,000 pending compactions. I understand that a pending compaction doesn't necessarily mean it is a scheduled task, but I'm confused why this behavior is occurring. It is the same on all nodes: the count occasionally drops by about 5,000 pending compaction tasks and then returns to 25,000-35,000.
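
For reference, the pending count here is the figure nodetool exposes, e.g.:

nodetool -h localhost compactionstats
# the first line reports "pending tasks: <n>"; any compactions currently running are listed below it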

I have tried repair and scrub operations on two of the nodes, and while compactions initially happen, the number of pending compactions does not decrease.

Any ideas? Thanks for your time.

Best,
michael





Re: persistent compaction issue (1.1.4 and 1.1.5)

Posted by Michael Kjellman <mk...@barracuda.com>.
According to a few of the .json files I checked for my largest column families, there are a large number of members in generation 0, which I'm assuming corresponds to L0.

On this particular node I'm checking, I have already tried a scrub and a repair. What steps should I take to move these SSTables to the next level? It looks like these compactions are indeed legitimate, then…

Thank you.

From: Ben Coverston <be...@datastax.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: persistent compaction issue (1.1.4 and 1.1.5)

In your data directory there should be a .json file for each column family that holds the manifest.

Do any of those indicate that you have a large number of SSTables in L0?

This number is also indicated in JMX by the UnLeveledSSTables count for each column family.

If not it's possible that the number is coming from SSTable fragments that are the result of a repair/decommission.



On Tue, Sep 18, 2012 at 8:57 AM, Michael Kjellman <mk...@barracuda.com>> wrote:
Leveled. Nothing in the logs. Normal compactions seem to be occurring... these ones just won't go away.

I've tried a rolling restart, and even tried killing our entire cluster and bringing up one node at a time in case gossip was causing this. Same result.

The compactions are there immediately after Thrift starts listening for clients.

Thanks Aaron

On Sep 18, 2012, at 3:54 AM, "aaron morton" <aa...@thelastpickle.com>>> wrote:

What Compaction Strategy are you using ?
Are there any errors in the logs ?
If you restart a node how long does it take for the numbers to start to rise ?

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 18/09/2012, at 7:39 AM, Michael Kjellman <mk...@barracuda.com>>> wrote:

Hi All,

I have an issue where each one of my nodes (currently all running at 1.1.5) is reporting around 30,000 pending compactions. I understand that a pending compaction doesn't necessarily mean it is a scheduled task however I'm confused why this behavior is occurring. It is the same on all nodes, occasionally goes down 5k pending compaction tasks, and then returns to 25,000-35,000 compaction tasks pending.

I have tried a repair operation/scrub operation on two of the nodes and while compactions initially happen the number of pending compactions does not decrease.

Any ideas? Thanks for your time.

Best,
michael











--
Ben Coverston
DataStax -- The Apache Cassandra Company





Re: persistent compaction issue (1.1.4 and 1.1.5)

Posted by Ben Coverston <be...@datastax.com>.
In your data directory there should be a .json file for each column family
that holds the manifest.

Do any of those indicate that you have a large number of SSTables in L0?

This number is also indicated in JMX by the UnLeveledSSTables count for
each column family.

If not, it's possible that the number is coming from SSTable fragments that are the result of a repair/decommission.
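
One quick way to count the level-0 entries from a manifest is something like the following; the data path and the exact JSON field names ("generations"/"members") are assumptions based on the 1.1 layout, so adjust as needed:

python -c 'import json,sys; m = json.load(open(sys.argv[1])); print(sum(len(g["members"]) for g in m["generations"] if g["generation"] == 0))' /var/lib/cassandra/data/MyKeyspace/MyCF.json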



On Tue, Sep 18, 2012 at 8:57 AM, Michael Kjellman <mk...@barracuda.com> wrote:

> Leveled. nothing in the logs. Normal compactions seem to be
> occurring...these ones just won't go away.
>
> I've tried a rolling restart and literally tries killing our entire
> cluster and bringing up one node at a time in case gossip was causing this.
> Same result.
>
> The compactions are there immediately after Thrift starts listening for
> clients.
>
> Thanks Aaron
>
> On Sep 18, 2012, at 3:54 AM, "aaron morton" <aaron@thelastpickle.com
> <ma...@thelastpickle.com>> wrote:
>
> What Compaction Strategy are you using ?
> Are there any errors in the logs ?
> If you restart a node how long does it take for the numbers to start to
> rise ?
>
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/09/2012, at 7:39 AM, Michael Kjellman <mkjellman@barracuda.com
> <ma...@barracuda.com>> wrote:
>
> Hi All,
>
> I have an issue where each one of my nodes (currently all running at
> 1.1.5) is reporting around 30,000 pending compactions. I understand that a
> pending compaction doesn't necessarily mean it is a scheduled task however
> I'm confused why this behavior is occurring. It is the same on all nodes,
> occasionally goes down 5k pending compaction tasks, and then returns to
> 25,000-35,000 compaction tasks pending.
>
> I have tried a repair operation/scrub operation on two of the nodes and
> while compactions initially happen the number of pending compactions does
> not decrease.
>
> Any ideas? Thanks for your time.
>
> Best,
> michael
>
>
>
>
>
>
>
>
>


-- 
Ben Coverston
DataStax -- The Apache Cassandra Company

Re: persistent compaction issue (1.1.4 and 1.1.5)

Posted by Michael Kjellman <mk...@barracuda.com>.
Leveled. Nothing in the logs. Normal compactions seem to be occurring... these ones just won't go away.

I've tried a rolling restart, and even tried killing our entire cluster and bringing up one node at a time in case gossip was causing this. Same result.

The compactions are there immediately after Thrift starts listening for clients.

Thanks Aaron

On Sep 18, 2012, at 3:54 AM, "aaron morton" <aa...@thelastpickle.com>> wrote:

What Compaction Strategy are you using ?
Are there any errors in the logs ?
If you restart a node how long does it take for the numbers to start to rise ?

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 18/09/2012, at 7:39 AM, Michael Kjellman <mk...@barracuda.com>> wrote:

Hi All,

I have an issue where each one of my nodes (currently all running at 1.1.5) is reporting around 30,000 pending compactions. I understand that a pending compaction doesn't necessarily mean it is a scheduled task however I'm confused why this behavior is occurring. It is the same on all nodes, occasionally goes down 5k pending compaction tasks, and then returns to 25,000-35,000 compaction tasks pending.

I have tried a repair operation/scrub operation on two of the nodes and while compactions initially happen the number of pending compactions does not decrease.

Any ideas? Thanks for your time.

Best,
michael









Re: persistent compaction issue (1.1.4 and 1.1.5)

Posted by aaron morton <aa...@thelastpickle.com>.
What compaction strategy are you using?
Are there any errors in the logs?
If you restart a node, how long does it take for the numbers to start to rise?

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 18/09/2012, at 7:39 AM, Michael Kjellman <mk...@barracuda.com> wrote:

> Hi All,
> 
> I have an issue where each one of my nodes (currently all running at 1.1.5) is reporting around 30,000 pending compactions. I understand that a pending compaction doesn't necessarily mean it is a scheduled task however I'm confused why this behavior is occurring. It is the same on all nodes, occasionally goes down 5k pending compaction tasks, and then returns to 25,000-35,000 compaction tasks pending.
> 
> I have tried a repair operation/scrub operation on two of the nodes and while compactions initially happen the number of pending compactions does not decrease.
> 
> Any ideas? Thanks for your time.
> 
> Best,
> michael
> 
> 
> 
> 


Re: persistent compaction issue (1.1.4 and 1.1.5)

Posted by Michael Kjellman <mk...@barracuda.com>.
I ended up switching the biggest offending column families back to size-tiered compaction, and pending compactions across the cluster dropped to 0 very quickly.
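
For the record, the switch back is a one-line schema change along the same lines as the leveled update quoted further down this thread; something like the following, assuming the short class name is accepted the same way LeveledCompactionStrategy is (the column family name is only illustrative):

[default@MyKeyspace] UPDATE COLUMN FAMILY mycf WITH
  compaction_strategy=SizeTieredCompactionStrategy;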

On Sep 19, 2012, at 10:55 PM, "Michael Kjellman" <mk...@barracuda.com> wrote:

> After changing my ss_table_size as recommended my pending compactions across the cluster have leveled off at 34808 but it isn't progressing after 24 hours at that level.
> 
> As I've already changed the most offending column families I think the only option I have left is to remove the .json files from all of the column families and do another rolling restart...
> 
> Developing... Thanks for the help so far
> 
> On Sep 19, 2012, at 10:35 PM, "Віталій Тимчишин" <ti...@gmail.com>> wrote:
> 
> I did see problems with schema agreement on 1.1.4, but they did go away after rolling restart (BTW: it would be still good to check describe schema for unreachable). Same rolling restart helped to force compactions after moving to Leveled compaction. If your compactions still don't go, you can try removing *.json files from the data directory of the stopped node to force moving all SSTables to level0.
> 
> Best regards, Vitalii Tymchyshyn
> 
> 2012/9/19 Michael Kjellman <mk...@barracuda.com>>
> Potentially the pending compactions are a symptom and not the root
> cause/problem.
> 
> When updating a 3rd column family with a larger sstable_size_in_mb it
> looks like the schema may not be in a good state
> 
> [default@xxxx] UPDATE COLUMN FAMILY screenshots WITH
> compaction_strategy=LeveledCompactionStrategy AND
> compaction_strategy_options={sstable_size_in_mb: 200};
> 290cf619-57b0-3ad1-9ae3-e313290de9c9
> Waiting for schema agreement...
> Warning: unreachable nodes 10.8.30.102The schema has not settled in 10
> seconds; further migrations are ill-advised until it does.
> Versions are UNREACHABLE:[10.8.30.102],
> 290cf619-57b0-3ad1-9ae3-e313290de9c9:[10.8.30.15, 10.8.30.14, 10.8.30.13,
> 10.8.30.103, 10.8.30.104, 10.8.30.105, 10.8.30.106],
> f1de54f5-8830-31a6-9cdd-aaa6220cccd1:[10.8.30.101]
> 
> 
> However, tpstats looks good. And the schema changes eventually do get
> applied on *all* the nodes (even the ones that seem to have different
> schema versions). There are no communications issues between the nodes and
> they are all in the same rack
> 
> root@xxxx:~# nodetool tpstats
> Pool Name                    Active   Pending      Completed   Blocked
> All time blocked
> ReadStage                         0         0        1254592         0
>            0
> RequestResponseStage              0         0        9480827         0
>            0
> MutationStage                     0         0        8662263         0
>            0
> ReadRepairStage                   0         0         339158         0
>            0
> ReplicateOnWriteStage             0         0              0         0
>            0
> GossipStage                       0         0        1469197         0
>            0
> AntiEntropyStage                  0         0              0         0
>            0
> MigrationStage                    0         0           1808         0
>            0
> MemtablePostFlusher               0         0            248         0
>            0
> StreamStage                       0         0              0         0
>            0
> FlushWriter                       0         0            248         0
>            4
> MiscStage                         0         0              0         0
>            0
> commitlog_archiver                0         0              0         0
>            0
> InternalResponseStage             0         0           5286         0
>            0
> HintedHandoff                     0         0             21         0
>            0
> 
> Message type           Dropped
> RANGE_SLICE                  0
> READ_REPAIR                  0
> BINARY                       0
> READ                         0
> MUTATION                     0
> REQUEST_RESPONSE             0
> 
> So I'm guessing maybe the different schema versions may be potentially
> stopping compactions? Will compactions still happen if there are different
> versions of the schema?
> 
> 
> 
> 
> 
> On 9/18/12 11:38 AM, "Michael Kjellman" <mk...@barracuda.com>> wrote:
> 
>> Thanks, I just modified the schema on the worse offending column family
>> (as determined by the .json) from 10MB to 200MB.
>> 
>> Should I kick off a compaction on this cf now/repair?/scrub?
>> 
>> Thanks
>> 
>> -michael
>> 
>> From: Віталій Тимчишин <ti...@gmail.com>>>
>> Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>>"
>> <us...@cassandra.apache.org>>>
>> To: "user@cassandra.apache.org<ma...@cassandra.apache.org>>"
>> <us...@cassandra.apache.org>>>
>> Subject: Re: persistent compaction issue (1.1.4 and 1.1.5)
>> 
>> I've started to use LeveledCompaction some time ago and from my
>> experience this indicates some SST on lower levels than they should be.
>> The compaction is going, moving them up level by level, but total count
>> does not change as new data goes in.
>> The numbers are pretty high as for me. Such numbers mean a lot of files
>> (over 100K in single directory) and a lot of thinking for compaction
>> executor to decide what to compact next. I can see numbers like 5K-10K
>> and still thing this is high number. If I were you, I'd increase
>> sstable_size_in_mb 10-20 times it is now.
>> 
>> 2012/9/17 Michael Kjellman
>> <mk...@barracuda.com>>>
>> Hi All,
>> 
>> I have an issue where each one of my nodes (currently all running at
>> 1.1.5) is reporting around 30,000 pending compactions. I understand that
>> a pending compaction doesn't necessarily mean it is a scheduled task
>> however I'm confused why this behavior is occurring. It is the same on
>> all nodes, occasionally goes down 5k pending compaction tasks, and then
>> returns to 25,000-35,000 compaction tasks pending.
>> 
>> I have tried a repair operation/scrub operation on two of the nodes and
>> while compactions initially happen the number of pending compactions does
>> not decrease.
>> 
>> Any ideas? Thanks for your time.
>> 
>> Best,
>> michael
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> --
>> Best regards,
>> Vitalii Tymchyshyn
>> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> --
> Best regards,
> Vitalii Tymchyshyn
> 
> 
> 




Re: persistent compaction issue (1.1.4 and 1.1.5)

Posted by Michael Kjellman <mk...@barracuda.com>.
After changing my sstable_size_in_mb as recommended, pending compactions across the cluster have leveled off at 34,808, but the count hasn't come down after 24 hours at that level.

As I've already changed the most offending column families, I think the only option I have left is to remove the .json files from all of the column families and do another rolling restart...

Developing... Thanks for the help so far

On Sep 19, 2012, at 10:35 PM, "Віталій Тимчишин" <ti...@gmail.com>> wrote:

I did see problems with schema agreement on 1.1.4, but they went away after a rolling restart (BTW: it would still be good to check describe schema for the unreachable node). The same rolling restart helped to force compactions after moving to Leveled compaction. If your compactions still don't progress, you can try removing the *.json files from the data directory of the stopped node to force moving all SSTables back to level 0.

Best regards, Vitalii Tymchyshyn

2012/9/19 Michael Kjellman <mk...@barracuda.com>>
Potentially the pending compactions are a symptom and not the root
cause/problem.

When updating a 3rd column family with a larger sstable_size_in_mb it
looks like the schema may not be in a good state

[default@xxxx] UPDATE COLUMN FAMILY screenshots WITH
compaction_strategy=LeveledCompactionStrategy AND
compaction_strategy_options={sstable_size_in_mb: 200};
290cf619-57b0-3ad1-9ae3-e313290de9c9
Waiting for schema agreement...
Warning: unreachable nodes 10.8.30.102The schema has not settled in 10
seconds; further migrations are ill-advised until it does.
Versions are UNREACHABLE:[10.8.30.102],
290cf619-57b0-3ad1-9ae3-e313290de9c9:[10.8.30.15, 10.8.30.14, 10.8.30.13,
10.8.30.103, 10.8.30.104, 10.8.30.105, 10.8.30.106],
f1de54f5-8830-31a6-9cdd-aaa6220cccd1:[10.8.30.101]


However, tpstats looks good. And the schema changes eventually do get
applied on *all* the nodes (even the ones that seem to have different
schema versions). There are no communications issues between the nodes and
they are all in the same rack

root@xxxx:~# nodetool tpstats
Pool Name                    Active   Pending      Completed   Blocked   All time blocked
ReadStage                         0         0        1254592         0                  0
RequestResponseStage              0         0        9480827         0                  0
MutationStage                     0         0        8662263         0                  0
ReadRepairStage                   0         0         339158         0                  0
ReplicateOnWriteStage             0         0              0         0                  0
GossipStage                       0         0        1469197         0                  0
AntiEntropyStage                  0         0              0         0                  0
MigrationStage                    0         0           1808         0                  0
MemtablePostFlusher               0         0            248         0                  0
StreamStage                       0         0              0         0                  0
FlushWriter                       0         0            248         0                  4
MiscStage                         0         0              0         0                  0
commitlog_archiver                0         0              0         0                  0
InternalResponseStage             0         0           5286         0                  0
HintedHandoff                     0         0             21         0                  0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
BINARY                       0
READ                         0
MUTATION                     0
REQUEST_RESPONSE             0

So I'm guessing maybe the different schema versions may be potentially
stopping compactions? Will compactions still happen if there are different
versions of the schema?





On 9/18/12 11:38 AM, "Michael Kjellman" <mk...@barracuda.com>> wrote:

>Thanks, I just modified the schema on the worse offending column family
>(as determined by the .json) from 10MB to 200MB.
>
>Should I kick off a compaction on this cf now/repair?/scrub?
>
>Thanks
>
>-michael
>
>From: Віталій Тимчишин <ti...@gmail.com>>>
>Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>>"
><us...@cassandra.apache.org>>>
>To: "user@cassandra.apache.org<ma...@cassandra.apache.org>>"
><us...@cassandra.apache.org>>>
>Subject: Re: persistent compaction issue (1.1.4 and 1.1.5)
>
>I've started to use LeveledCompaction some time ago and from my
>experience this indicates some SST on lower levels than they should be.
>The compaction is going, moving them up level by level, but total count
>does not change as new data goes in.
>The numbers are pretty high as for me. Such numbers mean a lot of files
>(over 100K in single directory) and a lot of thinking for compaction
>executor to decide what to compact next. I can see numbers like 5K-10K
>and still thing this is high number. If I were you, I'd increase
>sstable_size_in_mb 10-20 times it is now.
>
>2012/9/17 Michael Kjellman
><mk...@barracuda.com>>>
>Hi All,
>
>I have an issue where each one of my nodes (currently all running at
>1.1.5) is reporting around 30,000 pending compactions. I understand that
>a pending compaction doesn't necessarily mean it is a scheduled task
>however I'm confused why this behavior is occurring. It is the same on
>all nodes, occasionally goes down 5k pending compaction tasks, and then
>returns to 25,000-35,000 compaction tasks pending.
>
>I have tried a repair operation/scrub operation on two of the nodes and
>while compactions initially happen the number of pending compactions does
>not decrease.
>
>Any ideas? Thanks for your time.
>
>Best,
>michael
>
>
>
>
>
>
>
>
>
>--
>Best regards,
> Vitalii Tymchyshyn
>
>
>
>
>









--
Best regards,
 Vitalii Tymchyshyn




Re: persistent compaction issue (1.1.4 and 1.1.5)

Posted by Віталій Тимчишин <ti...@gmail.com>.
I did see problems with schema agreement on 1.1.4, but they went away after a rolling restart (BTW: it would still be good to check describe schema for the unreachable node). The same rolling restart helped to force compactions after moving to Leveled compaction. If your compactions still don't progress, you can try removing the *.json files from the data directory of the stopped node to force moving all SSTables back to level 0.
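
A rough sketch of that procedure on a single node (the data path and service name are only examples, adjust for your install):

# stop the node, remove the per-CF leveled manifest(s), then start it again;
# on restart the manifest is rebuilt with every SSTable back in L0
sudo service cassandra stop
rm /var/lib/cassandra/data/MyKeyspace/*.json
sudo service cassandra start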

Best regards, Vitalii Tymchyshyn

2012/9/19 Michael Kjellman <mk...@barracuda.com>

> Potentially the pending compactions are a symptom and not the root
> cause/problem.
>
> When updating a 3rd column family with a larger sstable_size_in_mb it
> looks like the schema may not be in a good state
>
> [default@xxxx] UPDATE COLUMN FAMILY screenshots WITH
> compaction_strategy=LeveledCompactionStrategy AND
> compaction_strategy_options={sstable_size_in_mb: 200};
> 290cf619-57b0-3ad1-9ae3-e313290de9c9
> Waiting for schema agreement...
> Warning: unreachable nodes 10.8.30.102The schema has not settled in 10
> seconds; further migrations are ill-advised until it does.
> Versions are UNREACHABLE:[10.8.30.102],
> 290cf619-57b0-3ad1-9ae3-e313290de9c9:[10.8.30.15, 10.8.30.14, 10.8.30.13,
> 10.8.30.103, 10.8.30.104, 10.8.30.105, 10.8.30.106],
> f1de54f5-8830-31a6-9cdd-aaa6220cccd1:[10.8.30.101]
>
>
> However, tpstats looks good. And the schema changes eventually do get
> applied on *all* the nodes (even the ones that seem to have different
> schema versions). There are no communications issues between the nodes and
> they are all in the same rack
>
> root@xxxx:~# nodetool tpstats
> Pool Name                    Active   Pending      Completed   Blocked
> All time blocked
> ReadStage                         0         0        1254592         0
>             0
> RequestResponseStage              0         0        9480827         0
>             0
> MutationStage                     0         0        8662263         0
>             0
> ReadRepairStage                   0         0         339158         0
>             0
> ReplicateOnWriteStage             0         0              0         0
>             0
> GossipStage                       0         0        1469197         0
>             0
> AntiEntropyStage                  0         0              0         0
>             0
> MigrationStage                    0         0           1808         0
>             0
> MemtablePostFlusher               0         0            248         0
>             0
> StreamStage                       0         0              0         0
>             0
> FlushWriter                       0         0            248         0
>             4
> MiscStage                         0         0              0         0
>             0
> commitlog_archiver                0         0              0         0
>             0
> InternalResponseStage             0         0           5286         0
>             0
> HintedHandoff                     0         0             21         0
>             0
>
> Message type           Dropped
> RANGE_SLICE                  0
> READ_REPAIR                  0
> BINARY                       0
> READ                         0
> MUTATION                     0
> REQUEST_RESPONSE             0
>
> So I'm guessing maybe the different schema versions may be potentially
> stopping compactions? Will compactions still happen if there are different
> versions of the schema?
>
>
>
>
>
> On 9/18/12 11:38 AM, "Michael Kjellman" <mk...@barracuda.com> wrote:
>
> >Thanks, I just modified the schema on the worse offending column family
> >(as determined by the .json) from 10MB to 200MB.
> >
> >Should I kick off a compaction on this cf now/repair?/scrub?
> >
> >Thanks
> >
> >-michael
> >
> >From: Віталій Тимчишин <ti...@gmail.com>>
> >Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>"
> ><us...@cassandra.apache.org>>
> >To: "user@cassandra.apache.org<ma...@cassandra.apache.org>"
> ><us...@cassandra.apache.org>>
> >Subject: Re: persistent compaction issue (1.1.4 and 1.1.5)
> >
> >I've started to use LeveledCompaction some time ago and from my
> >experience this indicates some SST on lower levels than they should be.
> >The compaction is going, moving them up level by level, but total count
> >does not change as new data goes in.
> >The numbers are pretty high as for me. Such numbers mean a lot of files
> >(over 100K in single directory) and a lot of thinking for compaction
> >executor to decide what to compact next. I can see numbers like 5K-10K
> >and still thing this is high number. If I were you, I'd increase
> >sstable_size_in_mb 10-20 times it is now.
> >
> >2012/9/17 Michael Kjellman
> ><mk...@barracuda.com>>
> >Hi All,
> >
> >I have an issue where each one of my nodes (currently all running at
> >1.1.5) is reporting around 30,000 pending compactions. I understand that
> >a pending compaction doesn't necessarily mean it is a scheduled task
> >however I'm confused why this behavior is occurring. It is the same on
> >all nodes, occasionally goes down 5k pending compaction tasks, and then
> >returns to 25,000-35,000 compaction tasks pending.
> >
> >I have tried a repair operation/scrub operation on two of the nodes and
> >while compactions initially happen the number of pending compactions does
> >not decrease.
> >
> >Any ideas? Thanks for your time.
> >
> >Best,
> >michael
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >--
> >Best regards,
> > Vitalii Tymchyshyn
> >
> >
> >
> >
> >
>
>
>
>
>
>
>


-- 
Best regards,
 Vitalii Tymchyshyn

Re: persistent compaction issue (1.1.4 and 1.1.5)

Posted by Michael Kjellman <mk...@barracuda.com>.
Potentially the pending compactions are a symptom and not the root
cause/problem.

When updating a third column family with a larger sstable_size_in_mb, it looks like the schema may not be in a good state:

[default@xxxx] UPDATE COLUMN FAMILY screenshots WITH
compaction_strategy=LeveledCompactionStrategy AND
compaction_strategy_options={sstable_size_in_mb: 200};
290cf619-57b0-3ad1-9ae3-e313290de9c9
Waiting for schema agreement...
Warning: unreachable nodes 10.8.30.102The schema has not settled in 10
seconds; further migrations are ill-advised until it does.
Versions are UNREACHABLE:[10.8.30.102],
290cf619-57b0-3ad1-9ae3-e313290de9c9:[10.8.30.15, 10.8.30.14, 10.8.30.13,
10.8.30.103, 10.8.30.104, 10.8.30.105, 10.8.30.106],
f1de54f5-8830-31a6-9cdd-aaa6220cccd1:[10.8.30.101]


However, tpstats looks good. And the schema changes eventually do get applied on *all* the nodes (even the ones that seem to have different schema versions). There are no communication issues between the nodes, and they are all in the same rack.

root@xxxx:~# nodetool tpstats
Pool Name                    Active   Pending      Completed   Blocked   All time blocked
ReadStage                         0         0        1254592         0                  0
RequestResponseStage              0         0        9480827         0                  0
MutationStage                     0         0        8662263         0                  0
ReadRepairStage                   0         0         339158         0                  0
ReplicateOnWriteStage             0         0              0         0                  0
GossipStage                       0         0        1469197         0                  0
AntiEntropyStage                  0         0              0         0                  0
MigrationStage                    0         0           1808         0                  0
MemtablePostFlusher               0         0            248         0                  0
StreamStage                       0         0              0         0                  0
FlushWriter                       0         0            248         0                  4
MiscStage                         0         0              0         0                  0
commitlog_archiver                0         0              0         0                  0
InternalResponseStage             0         0           5286         0                  0
HintedHandoff                     0         0             21         0                  0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
BINARY                       0
READ                         0
MUTATION                     0
REQUEST_RESPONSE             0

So I'm guessing the different schema versions may be what is stopping compactions? Will compactions still happen if there are different versions of the schema?
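
One way to keep an eye on this is describe cluster from cassandra-cli, which should list the schema versions each node is reporting (so the holdout is easy to spot):

[default@xxxx] describe cluster;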





On 9/18/12 11:38 AM, "Michael Kjellman" <mk...@barracuda.com> wrote:

>Thanks, I just modified the schema on the worse offending column family
>(as determined by the .json) from 10MB to 200MB.
>
>Should I kick off a compaction on this cf now/repair?/scrub?
>
>Thanks
>
>-michael
>
>From: Віталій Тимчишин <ti...@gmail.com>>
>Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>"
><us...@cassandra.apache.org>>
>To: "user@cassandra.apache.org<ma...@cassandra.apache.org>"
><us...@cassandra.apache.org>>
>Subject: Re: persistent compaction issue (1.1.4 and 1.1.5)
>
>I've started to use LeveledCompaction some time ago and from my
>experience this indicates some SST on lower levels than they should be.
>The compaction is going, moving them up level by level, but total count
>does not change as new data goes in.
>The numbers are pretty high as for me. Such numbers mean a lot of files
>(over 100K in single directory) and a lot of thinking for compaction
>executor to decide what to compact next. I can see numbers like 5K-10K
>and still thing this is high number. If I were you, I'd increase
>sstable_size_in_mb 10-20 times it is now.
>
>2012/9/17 Michael Kjellman
><mk...@barracuda.com>>
>Hi All,
>
>I have an issue where each one of my nodes (currently all running at
>1.1.5) is reporting around 30,000 pending compactions. I understand that
>a pending compaction doesn't necessarily mean it is a scheduled task
>however I'm confused why this behavior is occurring. It is the same on
>all nodes, occasionally goes down 5k pending compaction tasks, and then
>returns to 25,000-35,000 compaction tasks pending.
>
>I have tried a repair operation/scrub operation on two of the nodes and
>while compactions initially happen the number of pending compactions does
>not decrease.
>
>Any ideas? Thanks for your time.
>
>Best,
>michael
>
>
>
>
>
>
>
>
>
>--
>Best regards,
> Vitalii Tymchyshyn
>
>
>
>
>





Re: persistent compaction issue (1.1.4 and 1.1.5)

Posted by Michael Kjellman <mk...@barracuda.com>.
Thanks, I just modified the schema on the worst-offending column family (as determined by the .json) from 10MB to 200MB.

Should I kick off a compaction on this cf now? Or a repair or scrub?

Thanks

-michael

From: Віталій Тимчишин <ti...@gmail.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: persistent compaction issue (1.1.4 and 1.1.5)

I started using LeveledCompaction some time ago, and in my experience this indicates SSTables sitting at lower levels than they should be. Compaction keeps going, moving them up level by level, but the total count does not change as new data comes in.
The numbers are pretty high to me. Numbers like that mean a lot of files (over 100K in a single directory) and a lot of thinking for the compaction executor to decide what to compact next. I see numbers like 5K-10K and still think that is high. If I were you, I'd increase sstable_size_in_mb to 10-20 times what it is now.

2012/9/17 Michael Kjellman <mk...@barracuda.com>>
Hi All,

I have an issue where each one of my nodes (currently all running at 1.1.5) is reporting around 30,000 pending compactions. I understand that a pending compaction doesn't necessarily mean it is a scheduled task however I'm confused why this behavior is occurring. It is the same on all nodes, occasionally goes down 5k pending compaction tasks, and then returns to 25,000-35,000 compaction tasks pending.

I have tried a repair operation/scrub operation on two of the nodes and while compactions initially happen the number of pending compactions does not decrease.

Any ideas? Thanks for your time.

Best,
michael









--
Best regards,
 Vitalii Tymchyshyn




Re: persistent compaction issue (1.1.4 and 1.1.5)

Posted by Віталій Тимчишин <ti...@gmail.com>.
I started using LeveledCompaction some time ago, and in my experience this indicates SSTables sitting at lower levels than they should be. Compaction keeps going, moving them up level by level, but the total count does not change as new data comes in.
The numbers are pretty high to me. Numbers like that mean a lot of files (over 100K in a single directory) and a lot of thinking for the compaction executor to decide what to compact next. I see numbers like 5K-10K and still think that is high. If I were you, I'd increase sstable_size_in_mb to 10-20 times what it is now.
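
In cassandra-cli that is a one-line schema change, along the lines of the following (the column family name and size are just an example):

[default@MyKeyspace] UPDATE COLUMN FAMILY mycf WITH
  compaction_strategy=LeveledCompactionStrategy AND
  compaction_strategy_options={sstable_size_in_mb: 200};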

2012/9/17 Michael Kjellman <mk...@barracuda.com>

> Hi All,
>
> I have an issue where each one of my nodes (currently all running at
> 1.1.5) is reporting around 30,000 pending compactions. I understand that a
> pending compaction doesn't necessarily mean it is a scheduled task however
> I'm confused why this behavior is occurring. It is the same on all nodes,
> occasionally goes down 5k pending compaction tasks, and then returns to
> 25,000-35,000 compaction tasks pending.
>
> I have tried a repair operation/scrub operation on two of the nodes and
> while compactions initially happen the number of pending compactions does
> not decrease.
>
> Any ideas? Thanks for your time.
>
> Best,
> michael
>
>
>
>
>
>
>


-- 
Best regards,
 Vitalii Tymchyshyn