Posted to dev@hbase.apache.org by "Andrew Kyle Purtell (Jira)" <ji...@apache.org> on 2022/06/12 17:56:00 UTC

[jira] [Created] (HBASE-27112) Investigate Netty resource usage limits

Andrew Kyle Purtell created HBASE-27112:
-------------------------------------------

             Summary: Investigate Netty resource usage limits
                 Key: HBASE-27112
                 URL: https://issues.apache.org/jira/browse/HBASE-27112
             Project: HBase
          Issue Type: Sub-task
          Components: IPC/RPC
    Affects Versions: 2.5.0
            Reporter: Andrew Kyle Purtell


We leave Netty-level resource limits unbounded. The number of threads to use for the event loop defaults to 0 (unbounded), and the default for io.netty.eventLoop.maxPendingTasks is INT_MAX.

We don't leave our own RPC handlers unbounded. We have a notion of a maximum handler pool size, with a default of 30, typically raised in production by the user. We constrain the depth of the request queue in multiple ways: limits on the number of queued calls, limits on the total size of call data that can be queued (to avoid memory overrun, just as in this case), CoDel conditioning of the call queues if enabled, and so on.
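For reference, a minimal sketch of where those limits live on our side, expressed as Configuration settings (the key names are the current ones; the values shown are illustrative and should be checked against the defaults in the version in use):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class RpcQueueLimits {
      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // Maximum handler pool size, default 30, typically raised in production.
        conf.setInt("hbase.regionserver.handler.count", 30);
        // Bound on the number of calls that may sit in the call queues.
        conf.setInt("hbase.ipc.server.max.callqueue.length", 100);
        // Bound on the total bytes of call data that may be queued (1 GB here).
        conf.setLong("hbase.ipc.server.max.callqueue.size", 1024L * 1024 * 1024);
        // Optional CoDel conditioning of the call queues.
        conf.set("hbase.ipc.server.callqueue.type", "codel");
      }
    }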

Under load, can we pile up an excess of pending request state, such as direct buffers containing request bytes, at the Netty layer because of downstream resource limits? Those limits will act as a bottleneck, as intended, and previously would also have applied backpressure through RPC, because SimpleRpcServer had a read thread limit ("hbase.ipc.server.read.threadpool.size", default 10). Netty, in comparison, may be able to queue up a lot more, because it has been optimized to prefer concurrency.

Consider the hbase.netty.eventloop.rpcserver.thread.count default. It is 0 (unbounded). I don't know how high it actually gets in production, because we lack the metric, but there are diminishing returns when threads > cores, so a reasonable default here could be Runtime.getRuntime().availableProcessors() instead of unbounded?
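By way of illustration, a bounded worker group at the Netty API level would look something like the sketch below (plain io.netty class names shown; in HBase the equivalent classes are relocated under the shaded org.apache.hbase.thirdparty package):

    import io.netty.channel.nio.NioEventLoopGroup;

    public class BoundedEventLoop {
      public static void main(String[] args) {
        // Bound the RPC server event loop to the core count rather than
        // leaving the thread count unspecified.
        int threads = Runtime.getRuntime().availableProcessors();
        NioEventLoopGroup workerGroup = new NioEventLoopGroup(threads);
        // ... pass workerGroup to the ServerBootstrap for the RPC server ...
        workerGroup.shutdownGracefully();
      }
    }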

maxPendingTasks probably should not be INT_MAX, but that may matter less.
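For reference, that limit is read from a Netty system property, so a lower cap could be experimented with via a JVM flag (the value below is only an example, not a recommendation):

    -Dio.netty.eventLoop.maxPendingTasks=65536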

The tasks here are:
- Instrument Netty-level resources to better understand actual resource allocation under load. Investigate what we need to plug in, and where, to gain visibility (see the sketch after this list).
- Where instrumentation designed for this issue can be implemented with low overhead, consider formally adding it as metrics.
- Based on the findings from this instrumentation, consider and implement next steps. The goal would be to limit concurrency at the Netty layer in such a way that performance remains good and resource usage does not balloon under load.
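
A minimal sketch of what the instrumentation could read, assuming the standard Netty 4.1 APIs (the class name here is hypothetical, and in HBase the Netty classes live under org.apache.hbase.thirdparty):

    import io.netty.buffer.PooledByteBufAllocator;
    import io.netty.buffer.PooledByteBufAllocatorMetric;
    import io.netty.channel.EventLoopGroup;
    import io.netty.util.concurrent.EventExecutor;
    import io.netty.util.concurrent.SingleThreadEventExecutor;

    public class NettyResourceProbe {
      /** Report direct/heap memory held by the default pooled allocator. */
      public static void logAllocatorUsage() {
        PooledByteBufAllocatorMetric m = PooledByteBufAllocator.DEFAULT.metric();
        System.out.println("netty direct bytes: " + m.usedDirectMemory());
        System.out.println("netty heap bytes:   " + m.usedHeapMemory());
      }

      /** Sum the pending task queue depth across an event loop group. */
      public static long pendingTasks(EventLoopGroup group) {
        long pending = 0;
        for (EventExecutor executor : group) {
          if (executor instanceof SingleThreadEventExecutor) {
            pending += ((SingleThreadEventExecutor) executor).pendingTasks();
          }
        }
        return pending;
      }
    }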

If the instrumentation and experimental results indicate no changes are necessary, we can close this as Not A Problem or WontFix. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)