Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/03/09 05:19:00 UTC

[jira] [Work logged] (BEAM-3500) JdbcIO: Improve connection management

     [ https://issues.apache.org/jira/browse/BEAM-3500?focusedWorklogId=78788&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-78788 ]

ASF GitHub Bot logged work on BEAM-3500:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Mar/18 05:18
            Start Date: 09/Mar/18 05:18
    Worklog Time Spent: 10m 
      Work Description: jbonofre commented on a change in pull request #4461: [BEAM-3500] "Attach" JDBC connection to the bundle and add DataSourceFactory allowing full control of the way the DataSource is created
URL: https://github.com/apache/beam/pull/4461#discussion_r173367286
 
 

 ##########
 File path: sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java
 ##########
 @@ -327,12 +428,47 @@ DataSource buildDatasource() throws Exception{
         if (getConnectionProperties() != null && getConnectionProperties().get() != null) {
           basicDataSource.setConnectionProperties(getConnectionProperties().get());
         }
-        return basicDataSource;
+        current = basicDataSource;
       }
+
+      // wrapping the datasource as a pooling datasource
+      DataSourceConnectionFactory connectionFactory = new DataSourceConnectionFactory(current);
+      PoolableConnectionFactory poolableConnectionFactory =
+              new PoolableConnectionFactory(connectionFactory, null);
+      GenericObjectPoolConfig poolConfig = new GenericObjectPoolConfig();
+      poolConfig.setMaxTotal(getPoolMaxTotal());
+      poolConfig.setBlockWhenExhausted(getPoolBlockWhenExhausted());
+      poolConfig.setMaxWaitMillis(getPoolMaxWaitMillis());
+      poolConfig.setMaxIdle(getPoolMaxIdle());
+      poolConfig.setMinIdle(getPoolMinIdle());
+      poolConfig.setTestOnBorrow(getPoolTestOnBorrow());
+      poolConfig.setTestOnReturn(getPoolTestOnReturn());
+      poolConfig.setNumTestsPerEvictionRun(getPoolNumTestsPerEvictionRun());
+      poolConfig.setMinEvictableIdleTimeMillis(getPoolMinEvictableIdleTimeMillis());
+      poolConfig.setTestWhileIdle(getPoolTestWhileIdle());
+      poolConfig.setSoftMinEvictableIdleTimeMillis(getPoolSoftMinEvictableIdleTimeMillis());
+      poolConfig.setLifo(getPoolLifo());
+      GenericObjectPool connectionPool =
+              new GenericObjectPool(poolableConnectionFactory, poolConfig);
+      poolableConnectionFactory.setPool(connectionPool);
+      poolableConnectionFactory.setValidationQuery("SELECT 1 FROM DUAL");
 
 Review comment:
   My bad, I forgot to update this to use the provided value. By default, the `validationQuery` should be null, and users can define it depending on their database.
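
The hard-coded `SELECT 1 FROM DUAL` in the diff is database-specific: it works on Oracle and MySQL but fails on databases without a DUAL table, such as PostgreSQL. A minimal sketch of the "null by default" behaviour described in the comment is below. It is not the code from the pull request: `ValidationSketch` and `isUsable` are hypothetical names, and the fallback to `Connection.isValid` is an assumption about what "no query configured" should mean. The stub connection in `main` exists only so the example runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class ValidationSketch {
  /**
   * Returns true if the connection looks usable. A null query means no custom
   * query was configured, so the driver's own isValid() check is used instead.
   */
  static boolean isUsable(Connection connection, String validationQuery, int timeoutSeconds) {
    try {
      if (validationQuery == null) {
        return connection.isValid(timeoutSeconds);
      }
      try (Statement statement = connection.createStatement()) {
        statement.setQueryTimeout(timeoutSeconds);
        statement.execute(validationQuery);
        return true;
      }
    } catch (SQLException e) {
      // Any SQL failure during the check marks the connection as unusable.
      return false;
    }
  }

  public static void main(String[] args) {
    // Stub connection (illustration only): isValid() succeeds, every other call fails.
    Connection stub = (Connection) Proxy.newProxyInstance(
        ValidationSketch.class.getClassLoader(),
        new Class<?>[] {Connection.class},
        (proxy, method, methodArgs) -> {
          if (method.getName().equals("isValid")) {
            return true;
          }
          throw new SQLException("stub cannot run " + method.getName());
        });
    System.out.println(isUsable(stub, null, 1));       // default: no query configured
    System.out.println(isUsable(stub, "SELECT 1", 1)); // user-supplied query
  }
}
```

With this shape, a user on Oracle could pass `SELECT 1 FROM DUAL` and a user on PostgreSQL `SELECT 1`, while everyone else gets the driver's check without configuring anything.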



Issue Time Tracking
-------------------

    Worklog Id:     (was: 78788)
    Time Spent: 2h 20m  (was: 2h 10m)

> JdbcIO: Improve connection management
> -------------------------------------
>
>                 Key: BEAM-3500
>                 URL: https://issues.apache.org/jira/browse/BEAM-3500
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-jdbc
>    Affects Versions: 2.2.0
>            Reporter: Pawel Bartoszek
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> The JdbcIO write DoFn acquires a connection in its {{@Setup}} method and releases it in {{@Teardown}}, which means the connection might stay open for days in a streaming job. Keeping a single connection open for that long is risky, as it is exposed to database issues, network issues, etc.
> *Taking connection from the pool when it is actually needed*
> I suggest that the connection be taken from the connection pool in the {{executeBatch}} method and released when the batch is flushed. This would allow the pool to deal with any unhealthy connections that are returned to it.
> *Make JdbcIO accept data source factory*
>  It would be nice if JdbcIO accepted a DataSourceFactory rather than the DataSource itself. I say this because the sink checks whether the DataSource implements the `Serializable` interface, which makes it impossible to pass a BasicDataSource (used internally by the sink), as it doesn't implement this interface. Something like:
> {code:java}
> interface DataSourceFactory extends Serializable {
>      DataSource create();
> }
> {code}
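
One way the factory proposed in the issue could be used is sketched below. It assumes a hypothetical `UrlDataSourceFactory` that holds only a JDBC URL, and it returns a `java.lang.reflect.Proxy` stub in place of a real DataSource so that the example runs without a driver or pooling library; neither of these comes from the pull request. The point it illustrates is that only the factory crosses the serialization boundary, and the non-serializable DataSource is created afterwards.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Proxy;
import javax.sql.DataSource;

class DataSourceFactorySketch {
  /** Factory shape proposed in the issue: serializable, creates the DataSource on demand. */
  interface DataSourceFactory extends Serializable {
    DataSource create();
  }

  /** Hypothetical factory that carries only serializable settings (here, a JDBC URL). */
  static class UrlDataSourceFactory implements DataSourceFactory {
    private static final long serialVersionUID = 1L;
    private final String jdbcUrl;

    UrlDataSourceFactory(String jdbcUrl) {
      this.jdbcUrl = jdbcUrl;
    }

    @Override
    public DataSource create() {
      final String url = jdbcUrl;
      // Placeholder: a real factory would build e.g. a pooled DataSource here.
      // This stub only answers toString() and rejects every other call.
      return (DataSource) Proxy.newProxyInstance(
          UrlDataSourceFactory.class.getClassLoader(),
          new Class<?>[] {DataSource.class},
          (proxy, method, methodArgs) -> {
            if (method.getName().equals("toString")) {
              return "StubDataSource(" + url + ")";
            }
            throw new UnsupportedOperationException(method.getName());
          });
    }
  }

  /** Serializes and deserializes the factory, roughly as happens when a DoFn is shipped to a worker. */
  static DataSourceFactory roundTrip(DataSourceFactory factory)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(factory);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (DataSourceFactory) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    DataSourceFactory copy = roundTrip(new UrlDataSourceFactory("jdbc:example://host/db"));
    // The DataSource is created only after deserialization.
    System.out.println(copy.create());
  }
}
```

Because the DataSource never has to be serialized, the `Serializable` check mentioned in the issue would apply to the factory alone.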


