Posted to user@cassandra.apache.org by Jaydeep Chovatia <ch...@gmail.com> on 2022/03/23 04:29:39 UTC

Cassandra 3.0.14 transport completely blocked

Hi,

I have been running Cassandra 3.0.14 in production for a long time. Recently
I hit what looks like a bug: all of a sudden, the transport thread-pool
hangs.

*Observation:*
If I run *nodetool tpstats*, it shows *"Native-Transport-Requests"* with
"Active" tasks that are blocked. I stopped all traffic and then sent a very
light load, but my requests are still being rejected, and blocked transport
tasks keep appearing.
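
For anyone hitting the same symptom, the tpstats check described here can be
scripted. The following is a minimal Python sketch, not a Cassandra tool. It
assumes the six-column pool table that nodetool tpstats prints in the 3.0 line
(Pool Name, Active, Pending, Completed, Blocked, All time blocked). The helper
names, the sample numbers, and the 128-thread ceiling (the usual
native_transport_max_threads default) are assumptions for illustration only.

```python
# Illustrative sample only; the numbers are made up, not taken from this thread.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0         123456         0                 0
Native-Transport-Requests       128        57        9876543        12              3456
"""

COLUMNS = ("active", "pending", "completed", "blocked", "all_time_blocked")


def parse_tpstats(text):
    """Parse the thread-pool rows of `nodetool tpstats` output into a dict."""
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        # A pool row is one name token followed by exactly five integers;
        # the header and the dropped-message table do not match this shape.
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            pools[parts[0]] = dict(zip(COLUMNS, map(int, parts[1:])))
    return pools


def ntr_looks_stuck(pools, max_threads=128):
    """Heuristic: every NTR thread is busy and work is still queued or blocked."""
    ntr = pools.get("Native-Transport-Requests")
    if ntr is None:
        return False
    return ntr["active"] >= max_threads and (ntr["pending"] > 0 or ntr["blocked"] > 0)


if __name__ == "__main__":
    print(ntr_looks_stuck(parse_tpstats(SAMPLE)))
```

Running the parser on output captured a few seconds apart, and comparing the
"completed" counter, would help tell a saturated pool from one that has
stopped making progress.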

*Fix:*
If I restart my cluster, everything works fine again, which suggests there
might be a deadlock or a similar problem in the system.
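
A restart discards the evidence, so a thread dump taken beforehand (for
example with the JDK's jstack) is one way to test the deadlock theory. The
sketch below is an assumption-laden helper, not part of Cassandra: it tallies
the "java.lang.Thread.State:" lines of a HotSpot thread dump and looks for the
"Java-level deadlock" banner that jstack prints when it detects a lock cycle.
The thread names in the sample are hypothetical, and a hang that does not
involve a lock cycle will not produce that banner.

```python
import re
from collections import Counter

# Matches lines such as "   java.lang.Thread.State: BLOCKED (on object monitor)".
STATE_RE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+(\w+)")

# Hypothetical dump fragment for illustration; thread names are invented.
SAMPLE_DUMP = """\
"worker-1" #41 daemon prio=5 os_prio=0 tid=0x1 nid=0x1 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
"worker-2" #42 daemon prio=5 os_prio=0 tid=0x2 nid=0x2 waiting on condition
   java.lang.Thread.State: WAITING (parking)
"worker-3" #43 daemon prio=5 os_prio=0 tid=0x3 nid=0x3 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
"""


def thread_state_counts(dump_text):
    """Count how many threads are in each JVM thread state."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = STATE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def reports_deadlock(dump_text):
    """True if the dump contains jstack's deadlock banner."""
    return "Java-level deadlock" in dump_text


if __name__ == "__main__":
    print(dict(thread_state_counts(SAMPLE_DUMP)))
    print(reports_deadlock(SAMPLE_DUMP))
```

A large and unchanging number of BLOCKED or WAITING threads across several
dumps would be consistent with a deadlock, though it does not prove one.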


Is anyone aware of this issue? I know there have been quite a lot of fixes
on top of 3.0.14. Is there a specific fix that addresses this particular
issue?

Any help would be appreciated.

Yours Sincerely,
Jaydeep

Re: Cassandra 3.0.14 transport completely blocked

Posted by Erick Ramirez <er...@apache.org>.
>
> Thanks, Scott, for the prompt response! We will apply this patch and see
> how it goes.
> Also, in the near future, we will consider upgrading to 3.0.26 and
> eventually to 4.0
>

We would really discourage you from just upgrading to C* 3.0.21. There
really is no logical reason for doing that. If you're going to the trouble
of upgrading the binaries, you might as well go all the way to C* 3.0.26
since it's a prerequisite to eventually upgrading to C* 4.0. Cheers!

Re: Cassandra 3.0.14 transport completely blocked

Posted by Jaydeep Chovatia <ch...@gmail.com>.
Thanks, Scott, for the prompt response! We will apply this patch and see
how it goes.
Also, in the near future, we will consider upgrading to 3.0.26 and
eventually to 4.0.
Thanks a lot!

On Tue, Mar 22, 2022 at 9:45 PM C. Scott Andreas <sc...@paradoxica.net>
wrote:

> Hi Jaydeep, thanks for reaching out.
>
> The most notable deadlock identified and resolved in the last few years is
> https://issues.apache.org/jira/browse/CASSANDRA-15367: Memtable memory
> allocations may deadlock (fixed in Apache Cassandra 3.0.21).
>
> Mentioning for completeness - since the release of Cassandra 3.0.14
> several years ago, many critical bugs whose consequences include data loss
> have been resolved. I'd strongly recommend upgrading to 3.0.26 - and
> ideally to 4.0 after you've confirmed behavior is as expected on 3.0.26.
>
> – Scott
>

Re: Cassandra 3.0.14 transport completely blocked

Posted by "C. Scott Andreas" <sc...@paradoxica.net>.
Hi Jaydeep, thanks for reaching out.

The most notable deadlock identified and resolved in the last few years is
https://issues.apache.org/jira/browse/CASSANDRA-15367: Memtable memory
allocations may deadlock (fixed in Apache Cassandra 3.0.21).

Mentioning for completeness - since the release of Cassandra 3.0.14
several years ago, many critical bugs whose consequences include data loss
have been resolved. I'd strongly recommend upgrading to 3.0.26 - and
ideally to 4.0 after you've confirmed behavior is as expected on 3.0.26.

– Scott

On Mar 22, 2022, at 9:30 PM, Jaydeep Chovatia <ch...@gmail.com> wrote:

> Hi,
>
> I have been using Cassandra 3.0.14 in production for a long time. Recently
> I have found a bug in that, all of a sudden the transport thread-pool
> hangs.
>
> Observation:
> If I do nodetool tpstats, then it shows "Native-Transport-Requests" is
> blocking "Active" tasks. I stopped the complete traffic, and sent a very
> light load, but still my requests are getting denied, and active transport
> blocked tasks keep happening.
>
> Fix:
> If I restart my cluster, then everything works fine, which means there
> might be some deadlock, etc. in the system.
>
> Is anyone aware of this issue? I know there have been quite a lot of fixes
> on top of 3.0.14, is there any specific fix that addresses this particular
> issue?
>
> Any help would be appreciated.
>
> Yours Sincerely,
> Jaydeep

Re: Cassandra 3.0.14 transport completely blocked

Posted by Jaydeep Chovatia <ch...@gmail.com>.
Thank you all. I will try different options and will let you know which one
worked for my case.

On Wed, Mar 23, 2022 at 3:25 AM Bowen Song <bo...@bso.ng> wrote:

> I remember we had the same issue back in the Cassandra 2.x days, and
> restarting the affected node only made the issue go away temporarily. The
> issue we had was "fixed" by adding
> "-Dcassandra.max_queued_native_transport_requests=4096" to the JVM options.
> I dug that option out of our old Ansible playbook. Now, after so many
> years, I've long forgotten what that option does.
>
> Please seriously consider upgrading your Cassandra cluster to the latest
> version. I can't tell which exact version fixed this bug, but we
> removed this option from our servers many years ago after several rounds of
> upgrades, and the NTR pool blocking issue has not come back.

Re: Cassandra 3.0.14 transport completely blocked

Posted by Bowen Song <bo...@bso.ng>.
I remember we had the same issue back in the Cassandra 2.x days, and
restarting the affected node only made the issue go away temporarily.
The issue we had was "fixed" by adding
"-Dcassandra.max_queued_native_transport_requests=4096" to the JVM
options. I dug that option out of our old Ansible playbook. Now, after
so many years, I've long forgotten what that option does.

Please seriously consider upgrading your Cassandra cluster to the latest
version. I can't tell which exact version fixed this bug, but we
removed this option from our servers many years ago after several rounds
of upgrades, and the NTR pool blocking issue has not come back.
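
For readers who want to try the same workaround, the option has to end up in
whatever file feeds flags to the Cassandra JVM. Which file that is (for
example conf/jvm.options or cassandra-env.sh) depends on the installation and
is an assumption here. The Python sketch below only shows the idempotent edit
an automation step might make to a list of option lines; the function name is
hypothetical, and the value 4096 is the one quoted in the message above, not a
tuned recommendation.

```python
def ensure_jvm_option(lines, name, value):
    """Return a copy of `lines` in which -D<name>=<value> appears exactly once.

    Any existing setting of the same system property is dropped first, so
    calling this repeatedly never produces duplicate flags.
    """
    prefix = f"-D{name}="
    kept = [line for line in lines if not line.strip().startswith(prefix)]
    kept.append(f"{prefix}{value}")
    return kept


if __name__ == "__main__":
    # Hypothetical starting contents of a JVM options file.
    current = ["-Xms4G", "-Xmx4G"]
    updated = ensure_jvm_option(
        current, "cassandra.max_queued_native_transport_requests", 4096
    )
    print("\n".join(updated))
```

The node has to be restarted for a changed JVM flag to take effect.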
