Posted to dev@calcite.apache.org by Steven McAllister <st...@gmail.com> on 2017/05/22 18:54:01 UTC

Connection Pooling for Calcite

Apologies if this question has been asked already.  I am experimenting with using Calcite to do a distributed join across two data sources (one Solr and one MongoDB).  I have a working example using CalciteConnection with the Solr and MongoDB adapters that executes a simple join, and am creating a REST API wrapper on top of it so that I can easily run scale and performance tests with different queries.  As part of this, the thought crossed my mind that it might be useful to implement a connection pool for CalciteConnections, so I created a simple connection pool using DBCP.  However, when I attempt to get a connection from the data source, I get the following exception:

java.sql.SQLException: Cannot create PoolableConnectionFactory (null)
	at org.apache.commons.dbcp2.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:2294)
	at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:2039)
	at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1533)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:678)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
Caused by: java.sql.SQLFeatureNotSupportedException
	at org.apache.calcite.avatica.Helper.unsupported(Helper.java:68)
	at org.apache.calcite.avatica.AvaticaConnection.isValid(AvaticaConnection.java:373)
	at org.apache.commons.dbcp2.DelegatingConnection.isValid(DelegatingConnection.java:918)
	at org.apache.commons.dbcp2.PoolableConnection.validate(PoolableConnection.java:283)
	at org.apache.commons.dbcp2.PoolableConnectionFactory.validateConnection(PoolableConnectionFactory.java:357)
	at org.apache.commons.dbcp2.BasicDataSource.validateConnectionFactory(BasicDataSource.java:2307)
	at org.apache.commons.dbcp2.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:2290)
	... 26 more

Looking at the code, it seems that Calcite’s AvaticaConnection implementation throws an exception when isValid is called. That causes this failure, since DBCP calls that method to check whether the underlying connection is still usable.

So I have a couple of questions:

1. Is there benefit in having a connection pool for CalciteConnection?  I haven’t looked through the internals deep enough to know if a connection pool would be helpful in optimizing any sort of operations or resources under the covers.
2. If connection pooling is useful, are there any known patterns that apply a standard connection pool library on top of Calcite?
3. If connection pooling is not useful, are there any concerns regarding thread safety for connections?  i.e. should I be creating a single connection for all queries, or recreating connections on each query?

Thanks,
Steven 

Re: Connection Pooling for Calcite

Posted by Julian Hyde <jh...@apache.org>.
This is a known issue: https://issues.apache.org/jira/browse/CALCITE-1520

It would be great if someone could fix it for avatica-1.10.
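
Until the driver implements isValid, one possible stopgap is to hand the pool a wrapped connection whose isValid does not reach AvaticaConnection. The following is only a sketch using the JDK's dynamic proxies, not something taken from Calcite or DBCP. The class name ValidatingConnections and the rule "a connection is valid while it is not closed" are assumptions made for illustration, and that rule may be too weak for remote connections.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;

/** Hypothetical helper; the name and the validity rule are illustrative assumptions. */
public final class ValidatingConnections {
    private ValidatingConnections() {}

    /**
     * Returns a Connection that forwards every call to the delegate,
     * except isValid(int), which is answered from isClosed().
     */
    public static Connection wrap(final Connection delegate) {
        InvocationHandler handler = (proxy, method, args) -> {
            if ("isValid".equals(method.getName()) && method.getParameterCount() == 1) {
                // Assumption: an open connection counts as valid; no round trip is made.
                return !delegate.isClosed();
            }
            try {
                return method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                // Rethrow the delegate's own exception (for example SQLException).
                throw e.getCause();
            }
        };
        return (Connection) Proxy.newProxyInstance(
            ValidatingConnections.class.getClassLoader(),
            new Class<?>[] {Connection.class},
            handler);
    }
}
```

The pool would have to be given the wrapped connections rather than the raw ones, for instance through a custom connection factory. That wiring depends on the pool library and is not shown here.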
