Posted to dev@hbase.apache.org by "Bryan Beaudreault (Jira)" <ji...@apache.org> on 2022/07/20 15:16:00 UTC

[jira] [Created] (HBASE-27228) Client connection warming API

Bryan Beaudreault created HBASE-27228:
-----------------------------------------

             Summary: Client connection warming API
                 Key: HBASE-27228
                 URL: https://issues.apache.org/jira/browse/HBASE-27228
             Project: HBase
          Issue Type: Improvement
            Reporter: Bryan Beaudreault


In a high-performance API or a low-latency stream worker, you often do not want the first few requests to pay one-time setup costs. In these cases, you want to warm connections before the process is ever added to the load balancer or processing group.

Upon first creating a Connection, there are two areas that can slow down the first few requests:
 * Fetching region locations
 * Creating the initial connection to each RegionServer, which sends connection headers, possibly does auth handshakes, etc.

A user can easily work around the first of these by calling Table.getRegionLocator().getAllRegionLocations().

It's more challenging for a user to warm the actual RegionServer connections. One way we have done this is to use a RegionLocator to fetch all locations for a table, reduce that down to one region per server, and then issue a Get for a row in each of those regions. We end up repeating this for every table that a process may connect to, because at the level where we do this we can't easily tell which servers have already been warmed. We have also run into various bugs over time, for example one where an empty start key causes a Get to fail.
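
As a rough illustration of that workaround, the sketch below models the "one region per server" reduction and the choice of a row to read, using plain Java only. The names WarmupTargets, Location, onePerServer and probeRow are hypothetical stand-ins, not HBase classes; in real code the locations would come from a RegionLocator and the chosen row would go into a Get. Substituting a single zero byte for an empty start key is an assumption made here to avoid the empty-row failure mentioned above, not something the client does for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the manual workaround; none of these types are HBase API.
final class WarmupTargets {

  /** Minimal stand-in for a region location: hosting server and region start key. */
  static final class Location {
    final String serverName;
    final byte[] startKey;

    Location(String serverName, byte[] startKey) {
      this.serverName = serverName;
      this.startKey = startKey;
    }
  }

  /** Keeps the first location seen for each server, preserving input order. */
  static List<Location> onePerServer(List<Location> all) {
    Map<String, Location> byServer = new LinkedHashMap<>();
    for (Location loc : all) {
      byServer.putIfAbsent(loc.serverName, loc);
    }
    return new ArrayList<>(byServer.values());
  }

  /**
   * Picks a row to read from the region. The first region of a table has an
   * empty start key, so (as an assumption of this sketch) a single zero byte
   * is used instead of an empty row.
   */
  static byte[] probeRow(Location loc) {
    return loc.startKey.length == 0 ? new byte[] {0} : loc.startKey.clone();
  }

  public static void main(String[] args) {
    List<Location> all = Arrays.asList(
        new Location("rs1", new byte[0]),
        new Location("rs2", new byte[] {(byte) 'm'}),
        new Location("rs1", new byte[] {(byte) 't'}));
    for (Location loc : onePerServer(all)) {
      System.out.println(loc.serverName + " -> " + Arrays.toString(probeRow(loc)));
    }
  }
}
```

Note that nothing in this sketch remembers which servers were warmed by an earlier call, which is why the workaround ends up being repeated for every table.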

We can make this easier for users by providing an API that uses Connection internals to warm these connections as cheaply as possible. I propose we add the following:

New Table/AsyncTable method {{warmConnections()}}. This would do the following:
 * use region locator to fetch all locations (with caching)
 * reduce returned locations to unique ServerNames
 * for each ServerName (with lock):
 ** if already warmed, skip
 ** otherwise, get a connection to that server and send an initial request to trigger socket creation/connection header/etc

With this API, someone connecting to multiple tables could warm each Table in parallel, and we'd only create connections to each server once.
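
The per-server bookkeeping in the steps above could look roughly like the following. This is a minimal sketch in plain Java, not the proposed implementation: ConnectionWarmer, warm and the probe callback are hypothetical names, server names are modeled as strings, and the probe stands in for "get a connection to that server and send an initial request". The "with lock" step is approximated with one future per server, so a concurrent caller waits for an in-flight warm-up instead of starting a second one.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch; this class is not part of the HBase client.
final class ConnectionWarmer {
  // One entry per server that has been, or is currently being, warmed.
  private final ConcurrentHashMap<String, CompletableFuture<Void>> warmed =
      new ConcurrentHashMap<>();
  // Stand-in for opening the connection and sending an initial request.
  private final Consumer<String> probe;

  ConnectionWarmer(Consumer<String> probe) {
    this.probe = probe;
  }

  /** Warms each distinct server at most once; returns how many this call warmed. */
  int warm(Collection<String> serverNames) {
    int newlyWarmed = 0;
    // Reduce the locations to unique server names before doing any work.
    for (String server : new LinkedHashSet<>(serverNames)) {
      CompletableFuture<Void> mine = new CompletableFuture<>();
      CompletableFuture<Void> existing = warmed.putIfAbsent(server, mine);
      if (existing != null) {
        existing.join(); // already warmed, or another caller is warming it now
        continue;
      }
      try {
        probe.accept(server);
        mine.complete(null);
        newlyWarmed++;
      } catch (RuntimeException e) {
        warmed.remove(server, mine); // allow a later call to retry this server
        mine.completeExceptionally(e);
        throw e;
      }
    }
    return newlyWarmed;
  }

  public static void main(String[] args) {
    ConnectionWarmer warmer =
        new ConnectionWarmer(server -> System.out.println("warming " + server));
    // Two tables whose regions partly live on the same servers.
    System.out.println(warmer.warm(Arrays.asList("rs1", "rs2", "rs1")));
    System.out.println(warmer.warm(Arrays.asList("rs2", "rs3")));
  }
}
```

Because the map would live at the Connection level rather than per table, warming a second table only probes the servers that the first table did not already cover.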


