Posted to user@cassandra.apache.org by Daniel Subak <da...@klaviyo.com> on 2016/11/16 19:05:22 UTC

Cassandra Node Restart Stuck in STARTING?

Hey everyone,

Ran into an issue during a node restart where "nodetool netstats" reported
the node as "STARTING" with no streams when run locally. "nodetool status"
run on other nodes reported that node as "DN". Both of those were expected.
However, tailing the logs, there didn't seem to be anything noteworthy
happening (the last few lines of system.log are below). Has anyone seen
this behavior before? We'd love to be able to better monitor what is
happening during a restart, if anyone has information on what happens
during this phase! Happy to provide more info if needed, but even a
high-level general explanation would provide some clarity.

Thanks,
Dan

INFO  [main] 2016-11-16 18:07:55,907 ColumnFamilyStore.java:405 - Initializing system_schema.keyspaces
INFO  [main] 2016-11-16 18:07:55,942 ColumnFamilyStore.java:405 - Initializing system_schema.tables
INFO  [main] 2016-11-16 18:07:55,971 ColumnFamilyStore.java:405 - Initializing system_schema.columns
INFO  [main] 2016-11-16 18:07:55,992 ColumnFamilyStore.java:405 - Initializing system_schema.triggers
INFO  [main] 2016-11-16 18:07:56,010 ColumnFamilyStore.java:405 - Initializing system_schema.dropped_columns
INFO  [main] 2016-11-16 18:07:56,026 ColumnFamilyStore.java:405 - Initializing system_schema.views
INFO  [main] 2016-11-16 18:07:56,047 ColumnFamilyStore.java:405 - Initializing system_schema.types
INFO  [main] 2016-11-16 18:07:56,066 ColumnFamilyStore.java:405 - Initializing system_schema.functions
INFO  [main] 2016-11-16 18:07:56,081 ColumnFamilyStore.java:405 - Initializing system_schema.aggregates
INFO  [main] 2016-11-16 18:07:56,093 ColumnFamilyStore.java:405 - Initializing system_schema.indexes
INFO  [main] 2016-11-16 18:07:56,102 ViewManager.java:139 - Not submitting build tasks for views in keyspace system_schema as storage service is not initialized
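
[Editor's sketch, not part of the original message] For the monitoring question above, one possible approach is to poll "nodetool netstats" and record the reported mode over time. The snippet below is a minimal sketch under assumptions: it assumes the output contains a line beginning with "Mode:" (as in the "STARTING" report described above), that nodetool is on the PATH, and the function names parse_mode and wait_for_mode are hypothetical helpers invented for this illustration.

```python
import re
import subprocess
import time
from typing import Optional

# Assumption: netstats output contains a line such as "Mode: STARTING".
MODE_RE = re.compile(r"^Mode:\s*(\S+)", re.MULTILINE)


def parse_mode(netstats_output: str) -> Optional[str]:
    """Return the mode reported in netstats output, or None if absent."""
    match = MODE_RE.search(netstats_output)
    return match.group(1) if match else None


def wait_for_mode(target: str = "NORMAL", interval_s: float = 10.0,
                  nodetool: str = "nodetool") -> None:
    """Poll `nodetool netstats` until the node reports the target mode.

    Hypothetical helper: prints one timestamped line per poll so that the
    time spent in each mode can be read back afterwards.
    """
    while True:
        result = subprocess.run([nodetool, "netstats"],
                                capture_output=True, text=True, check=False)
        mode = parse_mode(result.stdout)
        print(time.strftime("%H:%M:%S"), mode or "unavailable")
        if mode == target:
            return
        time.sleep(interval_s)
```

Calling wait_for_mode() on the restarting node prints the mode every ten seconds. This only shows how long the node stays in a given mode; it does not explain what the node is doing while it is there.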

Re: Cassandra Node Restart Stuck in STARTING?

Posted by Surbhi Gupta <su...@gmail.com>.
Attaching the full system.log would give more details.


Re: Cassandra Node Restart Stuck in STARTING?

Posted by Daniel Subak <da...@klaviyo.com>.
We're on Cassandra 3.7, running on Ubuntu 14.04.
In terms of system utilization, we saw one Cassandra process which was
using 100% CPU, but overall load was very low on the box. Disk utilization
was largely nominal.
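
[Editor's sketch, not part of the original message] When a single process is pinned at 100% CPU, a common JVM technique is to find the busy OS thread ID (for example with "top -H -p <pid>") and match it against the "nid=0x..." field of a thread dump taken with jstack. The snippet below is a minimal sketch under assumptions: it assumes the HotSpot thread dump header format in which each thread line starts with a quoted name and carries a hexadecimal nid, and thread_name_for_tid is a hypothetical helper invented for this illustration.

```python
import re
from typing import Optional

# Assumption: thread header lines look like
#   "main" #1 prio=5 os_prio=0 tid=0x... nid=0x1a2b runnable [0x...]
THREAD_RE = re.compile(
    r'^"(?P<name>[^"]+)".*?\bnid=(?P<nid>0x[0-9a-fA-F]+)',
    re.MULTILINE,
)


def thread_name_for_tid(jstack_output: str, os_tid: int) -> Optional[str]:
    """Return the name of the thread whose nid equals the decimal OS thread ID."""
    for match in THREAD_RE.finditer(jstack_output):
        # nid is printed in hex; tools like top show the same ID in decimal.
        if int(match.group("nid"), 16) == os_tid:
            return match.group("name")
    return None
```

The thread name and the stack trace that follows it in the dump usually indicate which part of startup is consuming the CPU.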

On Wed, Nov 16, 2016 at 2:19 PM, Jeff Jirsa <je...@crowdstrike.com>
wrote:

> What version?
>
> Is the system doing anything (do you see high CPU / disk usage)?
>
>
>
> Sometimes restarts will trigger some changes to files on disk that are
> mostly invisible in the logs
> (https://issues.apache.org/jira/browse/CASSANDRA-11163 for example), but
> it’s usually during a different part of the startup process (you’d be
> seeing different log messages), and would eventually complete.

Re: Cassandra Node Restart Stuck in STARTING?

Posted by Jeff Jirsa <je...@crowdstrike.com>.
What version?

Is the system doing anything (do you see high CPU / disk usage)?

Sometimes restarts will trigger some changes to files on disk that are mostly invisible in the logs (https://issues.apache.org/jira/browse/CASSANDRA-11163 for example), but it’s usually during a different part of the startup process (you’d be seeing different log messages), and would eventually complete.

 

 

 
