Posted to issues@drill.apache.org by "James Turton (Jira)" <ji...@apache.org> on 2021/10/20 07:10:00 UTC
[jira] [Created] (DRILL-8016) Default to lazy outbound connections
for storage-jdbc
James Turton created DRILL-8016:
-----------------------------------
Summary: Default to lazy outbound connections for storage-jdbc
Key: DRILL-8016
URL: https://issues.apache.org/jira/browse/DRILL-8016
Project: Apache Drill
Issue Type: Improvement
Components: Storage - JDBC
Affects Versions: 1.19.0
Reporter: James Turton
Assignee: James Turton
Fix For: 1.20.0
Currently we are eager about initiating outbound JDBC
connections, bringing up 10 per storage config per drillbit. For
example, if a user creates 3 storage configs pointing to a single DBMS
(the configs differing in their DB path and credentials, say) on a
cluster of 5 drillbits then we'll bring up 10x3x5 = 150 connections as
soon as we can and try to keep them up permanently. The fixed pool size
of 10 is a default we picked up from HikariCP which surely set it with
application servers in mind.
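
To make the arithmetic above concrete, here is a minimal sketch in Java.
The class and method names are hypothetical and are not part of Drill or
HikariCP; the inputs are just the figures from the example (a fixed pool
of 10, 3 storage configs, 5 drillbits).

```java
// Hypothetical helper, not Drill code: estimates how many outbound JDBC
// connections a cluster holds open when every pool has a fixed size.
class EagerConnectionEstimate {

    // One pool exists per storage config per drillbit, so the total is
    // simply the product of the three quantities.
    static int totalConnections(int poolSize, int storageConfigs, int drillbits) {
        return poolSize * storageConfigs * drillbits;
    }

    public static void main(String[] args) {
        int poolSize = 10;       // fixed pool size inherited from HikariCP
        int storageConfigs = 3;  // configs pointing at the same DBMS
        int drillbits = 5;       // nodes in the Drill cluster

        int total = totalConnections(poolSize, storageConfigs, drillbits);
        System.out.println("connections held open: " + total);
    }
}
```

Since the total is a plain product, adding one more storage config or
one more drillbit raises it by a full multiple of the pool size.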
We've had a report from the field of a MySQL server declining to provide
said 150 connections, leaving the Drill user unable to proceed.
Additionally, as you can imagine, almost all 150 connections will be
idle most of the time for typical Drill cluster workloads. Furthermore,
while connection pools are ubiquitous in the OLTP world they are rare
in the OLAP world, where the cost of creating and destroying a
connection is negligible compared to the cost of a single user query,
while the benefits of per-user access control, resource management and
session management that unpooled connections bring over shared pools
are valuable. Bringing
these latter benefits to Drill's outbound JDBC connections is not in the
scope of this email; the point being made is only that "traditionally,
OLAP environments have avoided connection pools because the losses far
outweigh the gains".
In light of the above I suggest that we transition from eager to lazy
outbound JDBC connections, more like Apache Spark (I'm told). I propose
initially that we only change our *default* HikariCP configuration to
maintain small, finitely scalable pools (e.g. baseline 1, up to 10)
instead of fixed pools. The HikariCP configuration is already
overridable today for users that prefer the current eager connection
behaviour.
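
As a rough illustration of the proposed default, the sketch below builds
the two pool settings as plain java.util.Properties and works out what
they would mean for the 3-config, 5-drillbit example. The property names
minimumIdle and maximumPoolSize are HikariCP's own; the class and method
names are hypothetical, and how Drill would actually pass these values
through a storage config is not shown here and should not be inferred
from this sketch.

```java
import java.util.Properties;

// Hypothetical sketch, not Drill code: a small, growable pool expressed
// as HikariCP-style properties (baseline 1, up to 10).
class LazyPoolSketch {

    static Properties lazyPoolProperties() {
        Properties props = new Properties();
        // Keep only one idle connection per pool instead of a full pool.
        props.setProperty("minimumIdle", "1");
        // Let the pool grow on demand up to the previous fixed size.
        props.setProperty("maximumPoolSize", "10");
        return props;
    }

    // One pool per storage config per drillbit.
    static int clusterConnections(int perPool, int storageConfigs, int drillbits) {
        return perPool * storageConfigs * drillbits;
    }

    public static void main(String[] args) {
        Properties props = lazyPoolProperties();
        int minIdle = Integer.parseInt(props.getProperty("minimumIdle"));
        int maxSize = Integer.parseInt(props.getProperty("maximumPoolSize"));

        // Connections held by an idle cluster versus a fully busy one.
        System.out.println("idle cluster: " + clusterConnections(minIdle, 3, 5));
        System.out.println("busy cluster: " + clusterConnections(maxSize, 3, 5));
    }
}
```

Under these assumptions the worst case is unchanged, but an idle cluster
holds a tenth of the connections it does today.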
http://mail-archives.apache.org/mod_mbox/drill-dev/202110.mbox/browser
--
This message was sent by Atlassian Jira
(v8.3.4#803005)