Posted to user@cassandra.apache.org by Jai Bheemsen Rao Dhanwada <ja...@gmail.com> on 2020/06/01 19:51:46 UTC

Cassandra Bootstrap Sequence

Hello Team,

When I bootstrap or restart a Cassandra node, there is a delay
between gossip settling and the CQL port opening. Can someone please explain
where this delay is configured and whether it can be changed? I don't see any
information in the logs.

In my case, as you can see below, there is a ~3 minute delay, and it
increases as I increase the number of tables, nodes, and DCs.

INFO  [main] 2020-05-31 23:51:07,554 Gossiper.java:1692 - Waiting for gossip to settle...
INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding
INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop
INFO  [main] 2020-05-31 23:54:06,913 Server.java:155 - Using Netty Version: [netty-buffer=netty-buffer-4.0.44.Final.452812a, netty-codec=netty-codec-4.0.44.Final.452812a, netty-codec-haproxy=netty-codec-haproxy-4.0.44.Final.452812a, netty-codec-http=netty-codec-http-4.0.44.Final.452812a, netty-codec-socks=netty-codec-socks-4.0.44.Final.452812a, netty-common=netty-common-4.0.44.Final.452812a, netty-handler=netty-handler-4.0.44.Final.452812a, netty-tcnative=netty-tcnative-1.1.33.Fork26.142ecbb, netty-transport=netty-transport-4.0.44.Final.452812a, netty-transport-native-epoll=netty-transport-native-epoll-4.0.44.Final.452812a, netty-transport-rxtx=netty-transport-rxtx-4.0.44.Final.452812a, netty-transport-sctp=netty-transport-sctp-4.0.44.Final.452812a, netty-transport-udt=netty-transport-udt-4.0.44.Final.452812a]
INFO  [main] 2020-05-31 23:54:06,913 Server.java:156 - Starting listening for CQL clients on /x.x.x.x:9042 (encrypted)...
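
For anyone wanting to measure this gap on their own nodes, the delay can be read straight off the timestamps of the "No gossip backlog" and "Netty using native Epoll" lines. A minimal Python sketch, assuming the timestamp layout shown in the log excerpt above (YYYY-MM-DD HH:MM:SS,mmm); the helper names parse_ts and seconds_between are made up for illustration and are not part of Cassandra:

```python
import re
from datetime import datetime

# Timestamp layout as it appears in the log lines: "2020-05-31 23:51:15,555"
TS_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def parse_ts(line):
    """Extract the first timestamp found in a log line (hypothetical helper)."""
    match = TS_PATTERN.search(line)
    if match is None:
        raise ValueError("no timestamp in line: %r" % line)
    return datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S,%f")


def seconds_between(first_line, second_line):
    """Elapsed seconds from first_line to second_line (hypothetical helper)."""
    return (parse_ts(second_line) - parse_ts(first_line)).total_seconds()


# The two lines that bracket the delay, copied from the log excerpt.
settled = ("INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - "
           "No gossip backlog; proceeding")
netty = ("INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - "
         "Netty using native Epoll event loop")

print(seconds_between(settled, netty))
```

For the two lines quoted here this comes to a little under three minutes, which matches the "~3 minute" figure.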


Also, during this 3 minute delay I lose all metrics from the C* nodes
(the metrics are not returned within 10s).

Can someone please help me understand the delay here?

Cassandra Version: 3.11.3
Metrics: Using telegraf to collect metrics.

Re: Cassandra Bootstrap Sequence

Posted by Reid Pinchback <rp...@tripadvisor.com>.
Would updating disk boundaries be sensitive to disk I/O tuning?  I’m remembering Jon Haddad’s talk about typical throughput problems in disk page sizing.

From: Jai Bheemsen Rao Dhanwada <ja...@gmail.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Tuesday, June 2, 2020 at 10:48 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Cassandra Bootstrap Sequence

3000 tables

On Tuesday, June 2, 2020, Durity, Sean R <SE...@homedepot.com> wrote:
How many total tables in the cluster?


Sean Durity

From: Jai Bheemsen Rao Dhanwada <ja...@gmail.com>
Sent: Monday, June 1, 2020 8:36 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra Bootstrap Sequence

Thanks Erick,

I see that mostly the tasks below are being run, but I don't quite understand what these scheduled tasks are for. Is there a way to reduce the boot-up time, or do I have to live with this delay?

$ zgrep "CompactionStrategyManager.java:380 - Recreating compaction strategy" debug.log*  | wc -l
3249
$ zgrep "DiskBoundaryManager.java:53 - Refreshing disk boundary cache for" debug.log*  | wc -l
6293
$ zgrep "DiskBoundaryManager.java:92 - Got local ranges" debug.log*  | wc -l
6308
$ zgrep "DiskBoundaryManager.java:56 - Updating boundaries from DiskBoundaries" debug.log*  | wc -l
3249
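
The same counts can be gathered in one pass with a short script. A minimal Python sketch of the zgrep commands above, with some assumptions stated up front: it does plain substring matching (zgrep treats the pattern as a regular expression), it only understands uncompressed files and files ending in .gz (other archive formats such as .zip are not handled), and the helper name count_matches is made up for illustration:

```python
import glob
import gzip

# Message fragments taken from the zgrep commands in this message.
PATTERNS = [
    "Recreating compaction strategy",
    "Refreshing disk boundary cache for",
    "Got local ranges",
    "Updating boundaries from DiskBoundaries",
]


def count_matches(pattern, file_glob="debug.log*"):
    """Count lines containing `pattern` across plain and .gz log files."""
    total = 0
    for path in sorted(glob.glob(file_glob)):
        # Assumption: compressed rotations end in ".gz"; anything else is read as text.
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", errors="replace") as handle:
            total += sum(1 for line in handle if pattern in line)
    return total


# Run from the Cassandra log directory; prints 0 for each pattern if no files match.
for pattern in PATTERNS:
    print(pattern, count_matches(pattern))
```

Note that debug.log* spans every rotated file, so the totals may include more than one restart.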





On Mon, Jun 1, 2020 at 5:01 PM Erick Ramirez <er...@datastax.com> wrote:
There are quite a lot of steps that take place during the startup sequence between these 2 lines:

INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding
INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop

For the most part, it's taken up by CompactionStrategyManager and DiskBoundaryManager. If you check debug.log, you'll see that it's mostly updating disk boundaries. The length of time it takes is proportional to the number of tables in the cluster.
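
Taking the proportionality statement at face value, a rough per-table cost can be worked out from the numbers in this thread: 171.312 seconds between the two log lines and roughly 3000 tables. This is back-of-the-envelope arithmetic in Python that assumes a purely linear relationship with no fixed overhead, which is a simplification and not a measurement; the function name estimate_startup_delay is made up for illustration:

```python
def estimate_startup_delay(observed_seconds, observed_tables, target_tables):
    """Scale an observed delay linearly to a different table count.

    Assumption: the delay is strictly proportional to the number of tables.
    """
    per_table_seconds = observed_seconds / observed_tables
    return per_table_seconds * target_tables


observed_seconds = 171.312   # gap between the two log lines in this thread
observed_tables = 3000       # table count reported in this thread

per_table_ms = observed_seconds / observed_tables * 1000
print("approx. ms per table:", round(per_table_ms, 1))
print("approx. seconds at 200 tables:",
      round(estimate_startup_delay(observed_seconds, observed_tables, 200), 1))
```

Real startup time also depends on hardware and on work unrelated to table count, so treat the result as an order-of-magnitude guide only.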

Have a look at this section [1] of CassandraDaemon if you're interested in the details of the startup sequence. Cheers!

[1] https://github.com/apache/cassandra/blob/cassandra-3.11.3/src/java/org/apache/cassandra/service/CassandraDaemon.java#L399-L435

________________________________

The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this Email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy and content of this attachment and for any damages or losses arising from any inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature, which may be contained in this attachment and shall not be liable for direct, indirect, consequential or special damages in connection with this e-mail message or its attachment.

Re: Cassandra Bootstrap Sequence

Posted by Jai Bheemsen Rao Dhanwada <ja...@gmail.com>.
I did some more debugging. It looks like "nodetool compactionstats"
hangs or takes a long time during this period, which causes the delay in
metrics. I am still puzzled why the nodetool compactionstats command takes
longer on all the nodes at the same time when only one node is being restarted.

$ time nodetool compactionstats
> pending tasks: 0
>
> real 1m17.559s
> user 0m2.340s
> sys 0m0.248s
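
To check whether a command like this would blow through the metrics agent's 10s collection window, it can be timed under the same limit. A minimal Python sketch using only the standard library; the helper name timed_run is made up for illustration, and the 10 second default simply mirrors the agent timeout mentioned in this thread:

```python
import subprocess
import time


def timed_run(command, timeout_seconds=10.0):
    """Run `command` and return (elapsed_seconds, timed_out).

    The child process is killed if it exceeds `timeout_seconds`.
    """
    start = time.monotonic()
    try:
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
            check=False,
        )
        timed_out = False
    except subprocess.TimeoutExpired:
        timed_out = True
    return time.monotonic() - start, timed_out
```

On a node, something like timed_run(["nodetool", "compactionstats"]) would report whether the call finished inside the window; running it in a loop on the other nodes while one node restarts would show when the slowdown begins and ends.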



Re: Cassandra Bootstrap Sequence

Posted by Jai Bheemsen Rao Dhanwada <ja...@gmail.com>.
Also during this time, I am losing metrics for all the nodes in the cluster
(the metrics agent times out collecting within 10s); they recover once the
node opens the CQL port. Is there any known issue which could cause this?
In my case the delay between gossip settle and CQL port open is 3 minutes,
and metrics were lost for all the nodes during that 3 minute period.


Re: Cassandra Bootstrap Sequence

Posted by Jai Bheemsen Rao Dhanwada <ja...@gmail.com>.
Thank you,

Does that mean there is no way to improve this delay, and that I have to
live with it since I have more tables?


RE: Cassandra Bootstrap Sequence

Posted by "Durity, Sean R" <SE...@homedepot.com>.
As I understand it, Cassandra clusters should be limited to a number of tables in the low hundreds (under 200), at most. What you are seeing is the carving up of memtables for each of those 3,000. I try to limit my clusters to roughly 100 tables.


Sean Durity


Re: Cassandra Bootstrap Sequence

Posted by Jai Bheemsen Rao Dhanwada <ja...@gmail.com>.
3000 tables


RE: Cassandra Bootstrap Sequence

Posted by "Durity, Sean R" <SE...@homedepot.com>.
How many total tables in the cluster?


Sean Durity

From: Jai Bheemsen Rao Dhanwada <ja...@gmail.com>
Sent: Monday, June 1, 2020 8:36 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Cassandra Bootstrap Sequence

Thanks Erick,

I see below tasks are being run mostly. I didn't quite understand what exactly these scheduled tasks are for? Is there a way to reduce the boot-up time or do I have to live with this delay?

$ zgrep "CompactionStrategyManager.java:380 - Recreating compaction strategy" debug.log*  | wc -l
3249
$ zgrep "DiskBoundaryManager.java:53 - Refreshing disk boundary cache for" debug.log*  | wc -l
6293
$ zgrep "DiskBoundaryManager.java:92 - Got local ranges" debug.log*  | wc -l
6308
$ zgrep "DiskBoundaryManager.java:56 - Updating boundaries from DiskBoundaries" debug.log*  | wc -l
3249






Re: Cassandra Bootstrap Sequence

Posted by Jai Bheemsen Rao Dhanwada <ja...@gmail.com>.
Thanks Erick,

I see that mostly the tasks below are being run, but I don't quite
understand what these scheduled tasks are for. Is there a way to reduce the
boot-up time, or do I have to live with this delay?

$ zgrep "CompactionStrategyManager.java:380 - Recreating compaction strategy" debug.log* | wc -l
3249
$ zgrep "DiskBoundaryManager.java:53 - Refreshing disk boundary cache for" debug.log* | wc -l
6293
$ zgrep "DiskBoundaryManager.java:92 - Got local ranges" debug.log* | wc -l
6308
$ zgrep "DiskBoundaryManager.java:56 - Updating boundaries from DiskBoundaries" debug.log* | wc -l
3249
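
The same tally can be gathered in a single pass over the logs. The following is a minimal Python sketch, not part of Cassandra: the helper names (read_lines, count_messages) are made up for illustration, and it assumes rotated debug.log files are either plain text, gzip, or zip archives. The message fragments are copied from the zgrep commands above.

```python
import glob
import gzip
import zipfile
from collections import Counter

# Message fragments taken from the zgrep commands in this thread.
PATTERNS = [
    "Recreating compaction strategy",
    "Refreshing disk boundary cache for",
    "Got local ranges",
    "Updating boundaries from DiskBoundaries",
]


def read_lines(path):
    """Yield text lines from a plain, gzip-compressed, or zip-compressed log file."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            for name in archive.namelist():
                with archive.open(name) as member:
                    for raw in member:
                        yield raw.decode("utf-8", errors="replace")
        return
    # gzip files start with the magic bytes 0x1f 0x8b.
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        yield from handle


def count_messages(paths, patterns=PATTERNS):
    """Count, per pattern, how many log lines contain that pattern."""
    counts = Counter({pattern: 0 for pattern in patterns})
    for path in paths:
        for line in read_lines(path):
            for pattern in patterns:
                if pattern in line:
                    counts[pattern] += 1
    return counts


if __name__ == "__main__":
    # Run from the Cassandra log directory (assumed to contain debug.log*).
    for pattern, count in count_messages(sorted(glob.glob("debug.log*"))).items():
        print(f"{count:8d}  {pattern}")
```

Comparing these counts against the number of tables gives a rough feel for how much of the startup work is repeated per table.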






On Mon, Jun 1, 2020 at 5:01 PM Erick Ramirez <er...@datastax.com>
wrote:

> There are quite a lot of steps that take place during the startup sequence
> between these 2 lines:
>
> INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding
> INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop
>
> For the most part, it's taken up by CompactionStrategyManager and
> DiskBoundaryManager. If you check debug.log, you'll see that it's mostly
> updating disk boundaries. The length of time it takes is proportional to
> the number of tables in the cluster.
>
> Have a look at this section [1] of CassandraDaemon if you're interested
> in the details of the startup sequence. Cheers!
>
> [1]
> https://github.com/apache/cassandra/blob/cassandra-3.11.3/src/java/org/apache/cassandra/service/CassandraDaemon.java#L399-L435
>

Re: Cassandra Bootstrap Sequence

Posted by Erick Ramirez <er...@datastax.com>.
There are quite a lot of steps that take place during the startup sequence
between these 2 lines:

INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding
INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop

For the most part, it's taken up by CompactionStrategyManager and
DiskBoundaryManager. If you check debug.log, you'll see that it's mostly
updating disk boundaries. The length of time it takes is proportional to
the number of tables in the cluster.

Have a look at this section [1] of CassandraDaemon if you're interested in
the details of the startup sequence. Cheers!

[1]
https://github.com/apache/cassandra/blob/cassandra-3.11.3/src/java/org/apache/cassandra/service/CassandraDaemon.java#L399-L435
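
To track how this gap changes as tables are added, it can be measured straight from the log. This is a small Python sketch under the assumption that the two marker messages quoted above appear in system.log with the timestamp format shown; the function names are hypothetical.

```python
import re
from datetime import datetime

# Timestamp format as it appears in the log lines quoted in this thread.
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")


def log_time(line):
    """Extract the timestamp from a Cassandra log line."""
    match = TIMESTAMP.search(line)
    if match is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")


def startup_gap(lines,
                start_marker="No gossip backlog; proceeding",
                end_marker="Netty using native Epoll event loop"):
    """Return the time between the first start marker and the next end marker."""
    start = end = None
    for line in lines:
        if start is None and start_marker in line:
            start = log_time(line)
        elif start is not None and end_marker in line:
            end = log_time(line)
            break
    if start is None or end is None:
        return None
    return end - start


lines = [
    "INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding",
    "INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop",
]
print(startup_gap(lines).total_seconds())  # 171.312
```

For the two lines quoted here the gap is 171.312 seconds, which matches the roughly 3 minutes reported in the original question.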

Re: Cassandra Bootstrap Sequence

Posted by Reid Pinchback <rp...@tripadvisor.com>.
The thing to look for in the GC logs would be signs that you’re bouncing against your memory limits and spending a lot of time in full GCs.

I’m not sure at what phase it kicks in, but there is definitely the potential for memory issues when you have large column families (large in the number of columns, I mean), and your mention that the situation gets worse in proportion to the number of tables is what brought GC to mind.  I’m not sure about the proportion to the number of nodes; I think there are thread counts that increase with the number of nodes, and more threads can also add to GC load, particularly in G1GC.

I’m speculating a bit on possible causes, but basically the idea was to look for GC load during those 3 minutes, because if you see it, then you’re not hunting for a timeout setting or anything like that; you’re hunting for a resource allocation setting.
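
One rough way to check for GC load in that window is to total the pauses Cassandra itself reports. The Python sketch below is only an illustration: it assumes GCInspector lines of the form "<timestamp> GCInspector.java:<n> - <collector> GC in <N>ms" in system.log or debug.log, which may differ between versions, and the function name gc_pause_ms is made up.

```python
import re
from datetime import datetime

# Assumed line shape (may vary by Cassandra version):
#   INFO  [Service Thread] <timestamp> GCInspector.java:<n> - <collector> GC in <N>ms. ...
GC_LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) GCInspector\.java:\d+ - "
    r"(?P<collector>.+?) GC in (?P<ms>\d+)ms"
)


def gc_pause_ms(lines, window_start, window_end):
    """Sum reported GC milliseconds per collector for lines inside the window."""
    totals = {}
    for line in lines:
        match = GC_LINE.search(line)
        if match is None:
            continue
        stamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        if window_start <= stamp <= window_end:
            collector = match.group("collector")
            totals[collector] = totals.get(collector, 0) + int(match.group("ms"))
    return totals
```

Passing the lines of the log file together with datetime(2020, 5, 31, 23, 51, 15) and datetime(2020, 5, 31, 23, 54, 7) would cover the gap from the original question. If the totals are a small fraction of the 3 minutes, GC is unlikely to be the main cause; note that short pauses may only be logged at DEBUG level, so debug.log may be the better input.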

From: Jai Bheemsen Rao Dhanwada <ja...@gmail.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Monday, June 1, 2020 at 7:15 PM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Cassandra Bootstrap Sequence

Is there anything specific to look for in the GC logs?
BTW, this delay always happens whenever I bootstrap a node or restart a C* process.

I don't believe it's a GC issue. Also, a correction to my initial question: it's not just bootstrap; every restart of the C* process causes this delay.


Re: Cassandra Bootstrap Sequence

Posted by Jai Bheemsen Rao Dhanwada <ja...@gmail.com>.
Is there anything specific to look for in the GC logs?
BTW, this delay always happens whenever I bootstrap a node or restart a C*
process.

I don't believe it's a GC issue. Also, a correction to my initial question:
it's not just bootstrap; every restart of the C* process causes this delay.

On Mon, Jun 1, 2020 at 3:22 PM Reid Pinchback <rp...@tripadvisor.com>
wrote:

> That gap seems a long time.  Have you checked GC logs around the timeframe?

Re: Cassandra Bootstrap Sequence

Posted by Reid Pinchback <rp...@tripadvisor.com>.
That gap seems a long time.  Have you checked GC logs around the timeframe?

From: Jai Bheemsen Rao Dhanwada <ja...@gmail.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Monday, June 1, 2020 at 3:52 PM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Cassandra Bootstrap Sequence

Hello Team,

When I am bootstrapping/restarting a Cassandra Node, there is a delay between gossip settle and port opening. Can someone please explain me where this delay is configured and can this be changed? I don't see any information in the logs

In my case if you see there is  a ~3 minutes delay and this increases if I increase the #of tables and #of nodes and DC.

INFO  [main] 2020-05-31 23:51:07,554 Gossiper.java:1692 - Waiting for gossip to settle...
INFO  [main] 2020-05-31 23:51:15,555 Gossiper.java:1723 - No gossip backlog; proceeding
INFO  [main] 2020-05-31 23:54:06,867 NativeTransportService.java:70 - Netty using native Epoll event loop
INFO  [main] 2020-05-31 23:54:06,913 Server.java:155 - Using Netty Version: [netty-buffer=netty-buffer-4.0.44.Final.452812a, netty-codec=netty-codec-4.0.44.Final.452812a, netty-codec-haproxy=netty-codec-haproxy-4.0.44.Final.452812a, netty-codec-http=netty-codec-http-4.0.44.Final.452812a, netty-codec-socks=netty-codec-socks-4.0.44.Final.452812a, netty-common=netty-common-4.0.44.Final.452812a, netty-handler=netty-handler-4.0.44.Final.452812a, netty-tcnative=netty-tcnative-1.1.33.Fork26.142ecbb, netty-transport=netty-transport-4.0.44.Final.452812a, netty-transport-native-epoll=netty-transport-native-epoll-4.0.44.Final.452812a, netty-transport-rxtx=netty-transport-rxtx-4.0.44.Final.452812a, netty-transport-sctp=netty-transport-sctp-4.0.44.Final.452812a, netty-transport-udt=netty-transport-udt-4.0.44.Final.452812a]
INFO  [main] 2020-05-31 23:54:06,913 Server.java:156 - Starting listening for CQL clients on /x.x.x.x:9042 (encrypted)...

Also during this 3 minutes delay, I am losing all my metrics from the C* nodes(basically the metrics are not returned within 10s).

Can someone please help me understand the delay here?

Cassandra Version: 3.11.3
Metrics: Using telegraf to collect metrics.