Posted to user@cassandra.apache.org by Evgeny Inberg <ev...@gmail.com> on 2019/05/01 13:04:09 UTC

Cassandra taking very long to start and server under heavy load

I have upgraded a Cassandra cluster from version 2.0.x to 3.11.4, going
through 2.1.14.
After the upgrade, I noticed that each node takes about 10-15 minutes to
start, and the server is under very heavy load.
I did some digging and got a few leads from the debug log.
Messages like:
*Keyspace.java:351 - New replication settings for keyspace system_auth -
invalidating disk boundary caches *
*CompactionStrategyManager.java:380 - Recreating compaction strategy - disk
boundaries are out of date for system_auth.roles.*

This is repeating for all keyspaces.

Any suggestions on what to check, and what might cause this to happen on every start?

Thanks!
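
For anyone following up on this, one way to quantify how often the two debug messages above repeat per keyspace is to count them in the debug log. This is only a sketch: the message texts are copied from the lines quoted above, while the log path is supplied by the caller and the function name is hypothetical.

```python
import re
import sys
from collections import Counter

# Message texts taken from the debug log lines quoted in the post.
INVALIDATE_RE = re.compile(
    r"New replication settings for keyspace (\S+) - invalidating disk boundary caches"
)
RECREATE_RE = re.compile(
    r"Recreating compaction strategy - disk boundaries are out of date "
    r"for ([^.\s]+)\.([^.\s]+)"
)


def count_boundary_messages(lines):
    """Count both message types per keyspace from an iterable of log lines."""
    invalidated = Counter()
    recreated = Counter()
    for line in lines:
        match = INVALIDATE_RE.search(line)
        if match:
            invalidated[match.group(1)] += 1
            continue
        match = RECREATE_RE.search(line)
        if match:
            # group(1) is the keyspace, group(2) the table
            recreated[match.group(1)] += 1
    return invalidated, recreated


# Usage: python count_boundaries.py /path/to/debug.log
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as handle:
        inv, rec = count_boundary_messages(handle)
    for keyspace in sorted(set(inv) | set(rec)):
        print(f"{keyspace}: invalidations={inv[keyspace]} "
              f"strategy_recreations={rec[keyspace]}")
```

Comparing the counts from two consecutive restarts would show whether the same keyspaces are affected every time.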

Re: Cassandra taking very long to start and server under heavy load

Posted by Carl Mueller <ca...@smartthings.com.INVALID>.

You may have hit the same behavior we encountered going from
2.1 to 2.2 a week or so ago.

We also have multiple data dirs. Hmmmmm.

In our case, we will purge the data of the big offending table.

How big are your nodes?

On Tue, May 7, 2019 at 1:40 AM Evgeny Inberg <ev...@gmail.com> wrote:

> Still no resolution for this. Did anyone else encounter the same behavior?

Re: Cassandra taking very long to start and server under heavy load

Posted by Evgeny Inberg <ev...@gmail.com>.

Still no resolution for this. Did anyone else encounter the same behavior?

On Thu, May 2, 2019 at 1:54 PM Evgeny Inberg <ev...@gmail.com> wrote:

> Yes, sstables were upgraded on each node.

Re: Cassandra taking very long to start and server under heavy load

Posted by Evgeny Inberg <ev...@gmail.com>.

Yes, sstables were upgraded on each node.

On Thu, 2 May 2019, 13:39 Nick Hatfield <ni...@metricly.com> wrote:

> Just curious, but did you make sure to run the sstable upgrade after you
> completed the move from 2.x to 3.x?

RE: Cassandra taking very long to start and server under heavy load

Posted by Nick Hatfield <ni...@metricly.com>.

Just curious, but did you make sure to run the sstable upgrade after you completed the move from 2.x to 3.x?
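
One way to answer this question per node is to tally the format version embedded in each sstable data file name. The sketch below is not from the thread. It assumes two file-name layouts: a newer one like `mc-1-big-Data.db` and an older one like `ks-tbl-ka-1-Data.db`. The helper names are hypothetical, and which version letters count as current for a given release should be checked against the Cassandra documentation.

```python
import os
import re
from collections import Counter

# Assumed layouts (not stated in the thread):
#   newer: <version>-<generation>-big-Data.db
#   older: <keyspace>-<table>-<version>-<generation>-Data.db
NEW_LAYOUT = re.compile(r"^([a-z]{2})-\d+-big-Data\.db$")
OLD_LAYOUT = re.compile(r"^.+-([a-z]{2})-\d+-Data\.db$")


def sstable_version(filename):
    """Return the two-letter format version, or None if not a data file."""
    for pattern in (NEW_LAYOUT, OLD_LAYOUT):
        match = pattern.match(filename)
        if match:
            return match.group(1)
    return None


def versions_by_table(data_dir):
    """Map each table directory (relative path) to a Counter of versions."""
    result = {}
    for root, dirs, files in os.walk(data_dir):
        # Snapshots and backups keep old files on purpose, so skip them.
        dirs[:] = [d for d in dirs if d not in ("snapshots", "backups")]
        counts = Counter()
        for name in files:
            version = sstable_version(name)
            if version is not None:
                counts[version] += 1
        if counts:
            result[os.path.relpath(root, data_dir)] = counts
    return result
```

A table directory that reports more than one version still holds files in an older format; those are the candidates for `nodetool upgradesstables`.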

From: Evgeny Inberg [mailto:evginb@gmail.com]
Sent: Thursday, May 02, 2019 1:31 AM
To: user@cassandra.apache.org
Subject: Re: Cassandra taking very long to start and server under heavy load

Using a single data disk.
Also, it is performing mostly heavy read operations according to the metrics collected.

Re: Cassandra taking very long to start and server under heavy load

Posted by Evgeny Inberg <ev...@gmail.com>.

Using a single data disk.
Also, it is performing mostly heavy read operations according to the
metrics collected.

On Wed, 1 May 2019, 20:14 Jeff Jirsa <jj...@gmail.com> wrote:

> Do you have multiple data disks?
> CASSANDRA-6696 changed behavior with multiple data disks to make it safer
> in the situation that one disk fails. It may be copying data to the right
> places on startup; can you see if sstables are being moved on disk?
>
> --
> Jeff Jirsa

Re: Cassandra taking very long to start and server under heavy load

Posted by Jeff Jirsa <jj...@gmail.com>.

Do you have multiple data disks?
CASSANDRA-6696 changed behavior with multiple data disks to make it safer in the situation that one disk fails. It may be copying data to the right places on startup; can you see if sstables are being moved on disk?

-- 
Jeff Jirsa
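
A rough way to check both questions above, offered as a sketch rather than anything from the thread: group the configured data directories by underlying device, then compare two listings of `-Data.db` files taken some time apart during startup. The directory list is assumed to be copied by hand from `data_file_directories` in `cassandra.yaml`, and all function names are hypothetical.

```python
import os
from collections import defaultdict


def group_by_device(directories):
    """Group directories by the device ID the filesystem reports."""
    devices = defaultdict(list)
    for directory in directories:
        devices[os.stat(directory).st_dev].append(directory)
    return dict(devices)


def data_files(directories):
    """Return the set of sstable data file paths under the directories."""
    found = set()
    for directory in directories:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                if name.endswith("-Data.db"):
                    found.add(os.path.join(root, name))
    return found


def diff_snapshots(before, after):
    """Return (appeared, vanished) paths between two data_files() results."""
    return sorted(after - before), sorted(before - after)
```

More than one key from `group_by_device` suggests several physical disks. Files that vanish from one data directory and appear in another between two listings would be consistent with sstables being relocated, although ordinary compaction also creates and removes files.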


> On May 1, 2019, at 6:04 AM, Evgeny Inberg <ev...@gmail.com> wrote:
> 
> I have upgraded a Cassandra cluster from version 2.0.x to 3.11.4, going through 2.1.14.
> After the upgrade, I noticed that each node takes about 10-15 minutes to start, and the server is under very heavy load.
> I did some digging and got a few leads from the debug log.
> Messages like:
> Keyspace.java:351 - New replication settings for keyspace system_auth - invalidating disk boundary caches 
> CompactionStrategyManager.java:380 - Recreating compaction strategy - disk boundaries are out of date for system_auth.roles.
> 
> This is repeating for all keyspaces. 
> 
> Any suggestions on what to check, and what might cause this to happen on every start?
> 
> Thanks!