Posted to user@cassandra.apache.org by Gil Ganz <gi...@gmail.com> on 2021/03/08 11:47:01 UTC

Node removal causes spike in pending native-transport requests and clients suffer

Hey,
We have a 3.11.9 cluster (recently upgraded from 2.1.14), and after the
upgrade we have an issue when we remove a node.

The moment I run the removenode command, 3 servers in the same dc start to
have a high number of pending native-transport-requests (getting to around
1M) and clients are having issues because of that. We are using vnodes (32),
so I don't see why 3 servers would be busier than the others (RF is 3, but I
don't see why that would be related).

Each node has a few TB of data, and in the past we were able to remove a
node in about half a day. Today, what happens is: in the first 1-2 hours we
have these issues on some nodes, then things go quiet while the remove is
still running and clients are ok, and a few hours later the same issue is
back (on the same problematic nodes) and clients have issues again, leading
us to run removenode force.

Reducing the stream throughput and the number of compactors has helped to
mitigate the issues a bit, but we still have this issue of pending
native-transport requests getting to insane numbers and clients suffering,
eventually causing us to run removenode force. Any ideas?
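
(For reference, the throttling in question looks roughly like this - the
values are illustrative only, not a recommendation:)

    # limit the streaming triggered by the removenode (MB/s)
    nodetool setstreamthroughput 25

    # ease compaction pressure while the removal runs (MB/s)
    nodetool setcompactionthroughput 16

    # concurrent_compactors itself is a cassandra.yaml setting; lowering it is
    # a rolling config change (or a JMX tweak), not a one-off nodetool command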

I saw that since 3.11.6 there is a parameter,
native_transport_max_concurrent_requests_in_bytes; I'm looking into setting
it, as perhaps it will prevent the number of pending tasks from getting so
high.
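
(A sketch of how that might look in cassandra.yaml - the absolute value here
is made up purely for illustration, and the per-IP companion setting is only
relevant if your 3.11.x build includes it:)

    # cap the total bytes of in-flight native-transport requests;
    # the shipped default (-1) keeps the heap-relative cap, an absolute
    # byte value overrides it
    native_transport_max_concurrent_requests_in_bytes: 268435456        # 256 MiB, illustrative
    # optional per-client-IP cap, if present in your build:
    # native_transport_max_concurrent_requests_in_bytes_per_ip: 67108864  # 64 MiB, illustrative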

Gil

Re: Node removal causes spike in pending native-transport requests and clients suffer

Posted by Gil Ganz <gi...@gmail.com>.
Hey Bowen,
I agree it's better to have smaller servers in general; this is the
smaller-servers version :)
In this case, I wouldn't say the data model is bad, and we certainly do our
best to tune everything so that less hardware is needed.
It's just that the data volume and the number of requests/s are very large
to begin with: multiple datacenters around the world (on-prem), each with
close to 100 servers.
Making the servers smaller would mean a much larger cluster, which has other
implications when it's on-prem.


On Fri, Mar 12, 2021 at 1:30 AM Bowen Song <bo...@bso.ng.invalid> wrote:

> May I ask why you scale your Cassandra cluster vertically instead of
> horizontally, as recommended?
>
> I'm asking because I had dealt with a vertically scaled cluster before. It
> was because they had query performance issues and blamed the hardware for
> not being strong enough. Scaling vertically helped them improve the query
> performance, but it turned out the root cause was bad data modelling, and
> it gradually got worse with the ever-increasing data size. Eventually they
> reached the ceiling of what money can realistically buy - 256GB RAM and 16
> cores 3.x GHz CPU per server in their case.
>
> Is that your case too? Bigger RAM, more cores and higher CPU frequency to
> help "fix" the performance issue? I really hope not.
>
>
> On 11/03/2021 09:57, Gil Ganz wrote:
>
> Yes. 192gb.
>
> On Thu, Mar 11, 2021 at 10:29 AM Kane Wilson <k...@raft.so> <k...@raft.so>
> wrote:
>
>> That is a very large heap.  I presume you are using G1GC? How much memory
>> do your servers have?
>>
>> raft.so - Cassandra consulting, support, managed services
>>
>> On Thu., 11 Mar. 2021, 18:29 Gil Ganz, <gi...@gmail.com> wrote:
>>
>>> I always prefer to do decommission, but the issue here  is these servers
>>> are on-prem, and disks die from time to time.
>>> It's a very large cluster, in multiple datacenters around the world, so
>>> it can take some time before we have a replacement, so we usually need to
>>> run removenode in such cases.
>>>
>>> Other than that there are no issues in the cluster, the load is
>>> reasonable, and when this issue happens, following a removenode, this huge
>>> number of NTR is what I see, weird thing it's only on some nodes.
>>> I have been running with a very small
>>> native_transport_max_concurrent_requests_in_bytes  setting for a few days
>>> now on some nodes (few mb's compared to the default 0.8 of a 60gb heap), it
>>> looks like it's good enough for the app, will roll it out to the entire dc
>>> and test removal again.
>>>
>>>
>>> On Tue, Mar 9, 2021 at 10:51 AM Kane Wilson <k...@raft.so> <k...@raft.so>
>>> wrote:
>>>
>>>> It's unlikely to help in this case, but you should be using nodetool
>>>> decommission on the node you want to remove rather than removenode from
>>>> another node (and definitely don't force removal)
>>>>
>>>> native_transport_max_concurrent_requests_in_bytes defaults to 10% of
>>>> the heap, which I suppose depending on your configuration could potentially
>>>> result in a smaller number of concurrent requests than previously. It's
>>>> worth a shot setting it higher to see if the issue is related. Is this the
>>>> only issue you see on the cluster? I assume load on the cluster is still
>>>> low/reasonable and the only symptom you're seeing is the increased NTR
>>>> requests?
>>>>
>>>> raft.so - Cassandra consulting, support, and managed services
>>>>
>>>>
>>>> On Mon, Mar 8, 2021 at 10:47 PM Gil Ganz <gi...@gmail.com> wrote:
>>>>
>>>>>
>>>>> Hey,
>>>>> We have a 3.11.9 cluster (recently upgraded from 2.1.14), and after
>>>>> the upgrade we have an issue when we remove a node.
>>>>>
>>>>> The moment I run the removenode command, 3 servers in the same dc
>>>>> start to have a high amount of pending native-transport-requests (getting
>>>>> to around 1M) and clients are having issues due to that. We are using
>>>>> vnodes (32), so I don't see why I would have 3 servers busier than others
>>>>> (RF is 3 but I don't see why it will be related).
>>>>>
>>>>> Each node has a few TB of data, and in the past we were able to remove
>>>>> a node in ~half a day, today what happens is in the first 1-2 hours we have
>>>>> these issues with some nodes, then things go quiet, remove is still running
>>>>> and clients are ok, a few hours later the same issue is back (with same
>>>>> nodes as the problematic ones), and clients have issues again, leading us
>>>>> to run removenode force.
>>>>>
>>>>> Reducing stream throughput and number of compactors has helped
>>>>> to mitigate the issues a bit, but we still have this issue of pending
>>>>> native-transport requests getting to insane numbers and clients suffering,
>>>>> eventually causing us to run remove force. Any idea?
>>>>>
>>>>> I saw since 3.11.6 there is a parameter
>>>>> native_transport_max_concurrent_requests_in_bytes, looking into setting
>>>>> this, perhaps this will prevent the number of pending tasks from getting so high.
>>>>>
>>>>> Gil
>>>>>
>>>>

Re: Node removal causes spike in pending native-transport requests and clients suffer

Posted by Bowen Song <bo...@bso.ng.INVALID>.
May I ask why you scale your Cassandra cluster vertically instead of
horizontally, as recommended?

I'm asking because I had dealt with a vertically scaled cluster before.
It was because they had query performance issues and blamed the hardware
for not being strong enough. Scaling vertically helped them improve the
query performance, but it turned out the root cause was bad data
modelling, and it gradually got worse with the ever-increasing data
size. Eventually they reached the ceiling of what money can realistically
buy - 256GB RAM and 16 cores 3.x GHz CPU per server in their case.

Is that your case too? Bigger RAM, more cores and higher CPU frequency 
to help "fix" the performance issue? I really hope not.


On 11/03/2021 09:57, Gil Ganz wrote:
> Yes. 192gb.
>
> On Thu, Mar 11, 2021 at 10:29 AM Kane Wilson <k...@raft.so> wrote:
>
>     That is a very large heap.  I presume you are using G1GC? How much
>     memory do your servers have?
>
>     raft.so - Cassandra consulting, support, managed services
>
>     On Thu., 11 Mar. 2021, 18:29 Gil Ganz, <gilganz@gmail.com
>     <ma...@gmail.com>> wrote:
>
>         I always prefer to do decommission, but the issue here  is
>         these servers are on-prem, and disks die from time to time.
>         It's a very large cluster, in multiple datacenters around the
>         world, so it can take some time before we have a replacement,
>         so we usually need to run removenode in such cases.
>
>         Other than that there are no issues in the cluster, the load
>         is reasonable, and when this issue happens, following a
>         removenode, this huge number of NTR is what I see, weird thing
>         it's only on some nodes.
>         I have been running with a very small
>         native_transport_max_concurrent_requests_in_bytes setting for
>         a few days now on some nodes (few mb's compared to the default
>         0.8 of a 60gb heap), it looks like it's good enough for the
>         app, will roll it out to the entire dc and test removal again.
>
>
>         On Tue, Mar 9, 2021 at 10:51 AM Kane Wilson <k...@raft.so> wrote:
>
>             It's unlikely to help in this case, but you should be
>             using nodetool decommission on the node you want to remove
>             rather than removenode from another node (and definitely
>             don't force removal)
>
>             native_transport_max_concurrent_requests_in_bytes defaults
>             to 10% of the heap, which I suppose depending on your
>             configuration could potentially result in a smaller number
>             of concurrent requests than previously. It's worth a shot
>             setting it higher to see if the issue is related. Is this
>             the only issue you see on the cluster? I assume load on
>             the cluster is still low/reasonable and the only symptom
>             you're seeing is the increased NTR requests?
>
>             raft.so <https://raft.so> - Cassandra consulting, support,
>             and managed services
>
>
>             On Mon, Mar 8, 2021 at 10:47 PM Gil Ganz
>             <gilganz@gmail.com <ma...@gmail.com>> wrote:
>
>
>                 Hey,
>                 We have a 3.11.9 cluster (recently upgraded from
>                 2.1.14), and after the upgrade we have an issue when
>                 we remove a node.
>
>                 The moment I run the removenode command, 3 servers in
>                 the same dc start to have a high amount of pending
>                 native-transport-requests (getting to around 1M) and
>                 clients are having issues due to that. We are using
>                 vnodes (32), so I don't see why I would have 3
>                 servers busier than others (RF is 3 but I don't see
>                 why it will be related).
>
>                 Each node has a few TB of data, and in the past we
>                 were able to remove a node in ~half a day, today what
>                 happens is in the first 1-2 hours we have these issues
>                 with some nodes, then things go quiet, remove is still
>                 running and clients are ok, a few hours later the same
>                 issue is back (with same nodes as the problematic
>                 ones), and clients have issues again, leading us to
>                 run removenode force.
>
>                 Reducing stream throughput and number of compactors
>                 has helped to mitigate the issues a bit, but we still
>                 have this issue of pending native-transport requests
>                 getting to insane numbers and clients suffering,
>                 eventually causing us to run remove force. Any idea?
>
>                 I saw since 3.11.6 there is a parameter
>                 native_transport_max_concurrent_requests_in_bytes,
>                 looking into setting this, perhaps this will prevent
>                 the number of pending tasks from getting so high.
>
>                 Gil
>

Re: Node removal causes spike in pending native-transport requests and clients suffer

Posted by Gil Ganz <gi...@gmail.com>.
Yes. 192gb.

On Thu, Mar 11, 2021 at 10:29 AM Kane Wilson <k...@raft.so> wrote:

> That is a very large heap.  I presume you are using G1GC? How much memory
> do your servers have?
>
> raft.so - Cassandra consulting, support, managed services
>
> On Thu., 11 Mar. 2021, 18:29 Gil Ganz, <gi...@gmail.com> wrote:
>
>> I always prefer to do decommission, but the issue here  is these servers
>> are on-prem, and disks die from time to time.
>> It's a very large cluster, in multiple datacenters around the world, so
>> it can take some time before we have a replacement, so we usually need to
>> run removenode in such cases.
>>
>> Other than that there are no issues in the cluster, the load is
>> reasonable, and when this issue happens, following a removenode, this huge
>> number of NTR is what I see, weird thing it's only on some nodes.
>> I have been running with a very small
>> native_transport_max_concurrent_requests_in_bytes  setting for a few days
>> now on some nodes (few mb's compared to the default 0.8 of a 60gb heap), it
>> looks like it's good enough for the app, will roll it out to the entire dc
>> and test removal again.
>>
>>
>> On Tue, Mar 9, 2021 at 10:51 AM Kane Wilson <k...@raft.so> wrote:
>>
>>> It's unlikely to help in this case, but you should be using nodetool
>>> decommission on the node you want to remove rather than removenode from
>>> another node (and definitely don't force removal)
>>>
>>> native_transport_max_concurrent_requests_in_bytes defaults to 10% of the
>>> heap, which I suppose depending on your configuration could potentially
>>> result in a smaller number of concurrent requests than previously. It's
>>> worth a shot setting it higher to see if the issue is related. Is this the
>>> only issue you see on the cluster? I assume load on the cluster is still
>>> low/reasonable and the only symptom you're seeing is the increased NTR
>>> requests?
>>>
>>> raft.so - Cassandra consulting, support, and managed services
>>>
>>>
>>> On Mon, Mar 8, 2021 at 10:47 PM Gil Ganz <gi...@gmail.com> wrote:
>>>
>>>>
>>>> Hey,
>>>> We have a 3.11.9 cluster (recently upgraded from 2.1.14), and after the
>>>> upgrade we have an issue when we remove a node.
>>>>
>>>> The moment I run the removenode command, 3 servers in the same dc start
>>>> to have a high amount of pending native-transport-requests (getting to
>>>> around 1M) and clients are having issues due to that. We are using vnodes
>>>> (32), so I don't see why I would have 3 servers busier than others (RF is
>>>> 3 but I don't see why it will be related).
>>>>
>>>> Each node has a few TB of data, and in the past we were able to remove
>>>> a node in ~half a day, today what happens is in the first 1-2 hours we have
>>>> these issues with some nodes, then things go quiet, remove is still running
>>>> and clients are ok, a few hours later the same issue is back (with same
>>>> nodes as the problematic ones), and clients have issues again, leading us
>>>> to run removenode force.
>>>>
>>>> Reducing stream throughput and number of compactors has helped
>>>> to mitigate the issues a bit, but we still have this issue of pending
>>>> native-transport requests getting to insane numbers and clients suffering,
>>>> eventually causing us to run remove force. Any idea?
>>>>
>>>> I saw since 3.11.6 there is a parameter
>>>> native_transport_max_concurrent_requests_in_bytes, looking into setting
>>>> this, perhaps this will prevent the number of pending tasks from getting so high.
>>>>
>>>> Gil
>>>>
>>>

Re: Node removal causes spike in pending native-transport requests and clients suffer

Posted by Kane Wilson <k...@raft.so>.
That is a very large heap.  I presume you are using G1GC? How much memory
do your servers have?
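
(For context, the G1 settings in question live in jvm.options; a minimal
sketch using the sizes mentioned elsewhere in this thread, purely
illustrative:)

    # jvm.options - 60 GB heap on a 192 GB host, as discussed in this thread
    -Xms60G
    -Xmx60G
    -XX:+UseG1GC
    -XX:MaxGCPauseMillis=500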

raft.so - Cassandra consulting, support, managed services

On Thu., 11 Mar. 2021, 18:29 Gil Ganz, <gi...@gmail.com> wrote:

> I always prefer to do decommission, but the issue here  is these servers
> are on-prem, and disks die from time to time.
> It's a very large cluster, in multiple datacenters around the world, so it
> can take some time before we have a replacement, so we usually need to run
> removenode in such cases.
>
> Other than that there are no issues in the cluster, the load is
> reasonable, and when this issue happens, following a removenode, this huge
> number of NTR is what I see, weird thing it's only on some nodes.
> I have been running with a very small
> native_transport_max_concurrent_requests_in_bytes  setting for a few days
> now on some nodes (few mb's compared to the default 0.8 of a 60gb heap), it
> looks like it's good enough for the app, will roll it out to the entire dc
> and test removal again.
>
>
> On Tue, Mar 9, 2021 at 10:51 AM Kane Wilson <k...@raft.so> wrote:
>
>> It's unlikely to help in this case, but you should be using nodetool
>> decommission on the node you want to remove rather than removenode from
>> another node (and definitely don't force removal)
>>
>> native_transport_max_concurrent_requests_in_bytes defaults to 10% of the
>> heap, which I suppose depending on your configuration could potentially
>> result in a smaller number of concurrent requests than previously. It's
>> worth a shot setting it higher to see if the issue is related. Is this the
>> only issue you see on the cluster? I assume load on the cluster is still
>> low/reasonable and the only symptom you're seeing is the increased NTR
>> requests?
>>
>> raft.so - Cassandra consulting, support, and managed services
>>
>>
>> On Mon, Mar 8, 2021 at 10:47 PM Gil Ganz <gi...@gmail.com> wrote:
>>
>>>
>>> Hey,
>>> We have a 3.11.9 cluster (recently upgraded from 2.1.14), and after the
>>> upgrade we have an issue when we remove a node.
>>>
>>> The moment I run the removenode command, 3 servers in the same dc start
>>> to have a high amount of pending native-transport-requests (getting to
>>> around 1M) and clients are having issues due to that. We are using vnodes
>>> (32), so I don't see why I would have 3 servers busier than others (RF is
>>> 3 but I don't see why it will be related).
>>>
>>> Each node has a few TB of data, and in the past we were able to remove a
>>> node in ~half a day, today what happens is in the first 1-2 hours we have
>>> these issues with some nodes, then things go quiet, remove is still running
>>> and clients are ok, a few hours later the same issue is back (with same
>>> nodes as the problematic ones), and clients have issues again, leading us
>>> to run removenode force.
>>>
>>> Reducing stream throughput and number of compactors has helped
>>> to mitigate the issues a bit, but we still have this issue of pending
>>> native-transport requests getting to insane numbers and clients suffering,
>>> eventually causing us to run remove force. Any idea?
>>>
>>> I saw since 3.11.6 there is a parameter
>>> native_transport_max_concurrent_requests_in_bytes, looking into setting
>>> this, perhaps this will prevent the number of pending tasks from getting so high.
>>>
>>> Gil
>>>
>>

Re: Node removal causes spike in pending native-transport requests and clients suffer

Posted by Gil Ganz <gi...@gmail.com>.
I always prefer to do a decommission, but the issue here is that these
servers are on-prem, and disks die from time to time.
It's a very large cluster, in multiple datacenters around the world, so it
can take some time before we have a replacement, which is why we usually
need to run removenode in such cases.

Other than that there are no issues in the cluster and the load is
reasonable; when this issue happens, following a removenode, this huge
number of pending NTR is what I see, and the weird thing is it's only on
some nodes.
I have been running with a very small
native_transport_max_concurrent_requests_in_bytes setting for a few days
now on some nodes (a few MB, compared to the default of 0.8 of a 60 GB
heap). It looks like it's good enough for the app, so I will roll it out to
the entire dc and test a removal again.
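
(A minimal way to keep an eye on that pending count while a removal runs,
for reference:)

    # poll the Native-Transport-Requests pool on a node during the removenode
    while true; do
        date
        nodetool tpstats | grep -i 'native-transport'
        sleep 60
    done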


On Tue, Mar 9, 2021 at 10:51 AM Kane Wilson <k...@raft.so> wrote:

> It's unlikely to help in this case, but you should be using nodetool
> decommission on the node you want to remove rather than removenode from
> another node (and definitely don't force removal)
>
> native_transport_max_concurrent_requests_in_bytes defaults to 10% of the
> heap, which I suppose depending on your configuration could potentially
> result in a smaller number of concurrent requests than previously. It's
> worth a shot setting it higher to see if the issue is related. Is this the
> only issue you see on the cluster? I assume load on the cluster is still
> low/reasonable and the only symptom you're seeing is the increased NTR
> requests?
>
> raft.so - Cassandra consulting, support, and managed services
>
>
> On Mon, Mar 8, 2021 at 10:47 PM Gil Ganz <gi...@gmail.com> wrote:
>
>>
>> Hey,
>> We have a 3.11.9 cluster (recently upgraded from 2.1.14), and after the
>> upgrade we have an issue when we remove a node.
>>
>> The moment I run the removenode command, 3 servers in the same dc start
>> to have a high amount of pending native-transport-requests (getting to
>> around 1M) and clients are having issues due to that. We are using vnodes
>> (32), so I don't see why I would have 3 servers busier than others (RF is
>> 3 but I don't see why it will be related).
>>
>> Each node has a few TB of data, and in the past we were able to remove a
>> node in ~half a day, today what happens is in the first 1-2 hours we have
>> these issues with some nodes, then things go quiet, remove is still running
>> and clients are ok, a few hours later the same issue is back (with same
>> nodes as the problematic ones), and clients have issues again, leading us
>> to run removenode force.
>>
>> Reducing stream throughput and number of compactors has helped
>> to mitigate the issues a bit, but we still have this issue of pending
>> native-transport requests getting to insane numbers and clients suffering,
>> eventually causing us to run remove force. Any idea?
>>
>> I saw since 3.11.6 there is a parameter
>> native_transport_max_concurrent_requests_in_bytes, looking into setting
>> this, perhaps this will prevent the number of pending tasks from getting so high.
>>
>> Gil
>>
>

Re: Node removal causes spike in pending native-transport requests and clients suffer

Posted by Kane Wilson <k...@raft.so>.
It's unlikely to help in this case, but you should be using nodetool
decommission on the node you want to remove rather than removenode from
another node (and definitely don't force the removal).
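
(Spelled out as commands - a sketch, with <host-id> standing in for the dead
node's Host ID from nodetool status:)

    # preferred: run on the node being removed, so it streams its own data away
    nodetool decommission

    # only when the node is already dead/unrecoverable: run from any live node
    nodetool removenode <host-id>

    # 'nodetool removenode force' skips the remaining streaming and should be
    # a last resort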

native_transport_max_concurrent_requests_in_bytes defaults to 10% of the
heap, which, I suppose, depending on your configuration, could potentially
result in a smaller number of concurrent requests than previously. It's
worth a shot setting it higher to see if the issue is related. Is this the
only issue you see on the cluster? I assume the load on the cluster is still
low/reasonable and the only symptom you're seeing is the increased pending
NTR requests?
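
(Back-of-envelope, using the ~60 GB heap mentioned elsewhere in this thread:
0.10 * 60 GB ≈ 6 GB of in-flight native-transport request bytes under the
default cap.)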

raft.so - Cassandra consulting, support, and managed services


On Mon, Mar 8, 2021 at 10:47 PM Gil Ganz <gi...@gmail.com> wrote:

>
> Hey,
> We have a 3.11.9 cluster (recently upgraded from 2.1.14), and after the
> upgrade we have an issue when we remove a node.
>
> The moment I run the removenode command, 3 servers in the same dc start to
> have a high amount of pending native-transport-requests (getting to around
> 1M) and clients are having issues due to that. We are using vnodes (32), so
> I don't see why I would have 3 servers busier than others (RF is 3 but I
> don't see why it will be related).
>
> Each node has a few TB of data, and in the past we were able to remove a
> node in ~half a day, today what happens is in the first 1-2 hours we have
> these issues with some nodes, then things go quiet, remove is still running
> and clients are ok, a few hours later the same issue is back (with same
> nodes as the problematic ones), and clients have issues again, leading us
> to run removenode force.
>
> Reducing stream throughput and number of compactors has helped to mitigate
> the issues a bit, but we still have this issue of pending native-transport
> requests getting to insane numbers and clients suffering, eventually
> causing us to run remove force. Any idea?
>
> I saw since 3.11.6 there is a parameter
> native_transport_max_concurrent_requests_in_bytes, looking into setting
> this, perhaps this will prevent the number of pending tasks from getting so high.
>
> Gil
>