Posted to issues@beam.apache.org by "Pablo Estrada (Jira)" <ji...@apache.org> on 2022/03/23 20:44:00 UTC
[jira] [Created] (BEAM-14161) Add dynamic splitting to JdbcIO.readWithPartitions
Pablo Estrada created BEAM-14161:
------------------------------------
Summary: Add dynamic splitting to JdbcIO.readWithPartitions
Key: BEAM-14161
URL: https://issues.apache.org/jira/browse/BEAM-14161
Project: Beam
Issue Type: Improvement
Components: io-java-jdbc
Reporter: Pablo Estrada
Assignee: Jean-Baptiste Onofré
Fix For: Not applicable
Currently, JdbcIO is essentially a {{DoFn}} executed with a {{ParDo}}. As a result, parallelism is limited: the read runs on a single executor.
We could instead create several JDBC {{BoundedSource}}s, each reading a subset of the original SQL query's results (for instance using row ID paging, or any other splitting/limit scheme we can derive from the original query), similar to what Sqoop does.
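
As a rough illustration of the row ID splitting idea, the sketch below divides a numeric key range into contiguous sub-ranges and builds one bounded query per sub-range, so that each query could in principle be read by a separate worker. It is a minimal sketch under stated assumptions, not Beam's implementation: the class {{RangePartitioner}}, its methods, and the table and column names are all hypothetical, and it assumes a numeric, indexed key whose minimum and maximum values are already known.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Beam: splits a key range into sub-ranges.
final class RangePartitioner {

    // Splits the closed range [lower, upper] into at most numPartitions
    // contiguous half-open ranges {start, end}, meaning start <= key < end.
    public static List<long[]> split(long lower, long upper, int numPartitions) {
        if (numPartitions < 1 || upper < lower) {
            throw new IllegalArgumentException("invalid range or partition count");
        }
        long total = upper - lower + 1; // assumes the range does not overflow a long
        // Ceiling division, so the ranges cover every key in [lower, upper].
        long stride = Math.max(1, (total + numPartitions - 1) / numPartitions);
        List<long[]> ranges = new ArrayList<>();
        for (long start = lower; start <= upper; start += stride) {
            long end = Math.min(start + stride, upper + 1);
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    // Builds the bounded query for one sub-range. The table and column are
    // concatenated directly for brevity; real code should validate identifiers
    // and bind the range bounds as prepared-statement parameters.
    public static String toQuery(String table, String column, long[] range) {
        return "SELECT * FROM " + table
            + " WHERE " + column + " >= " + range[0]
            + " AND " + column + " < " + range[1];
    }
}
```

For example, {{RangePartitioner.split(1, 10, 3)}} yields the ranges [1, 5), [5, 9) and [9, 11), and {{toQuery("orders", "id", ...)}} turns the first of these into {{SELECT * FROM orders WHERE id >= 1 AND id < 5}}. A static split like this only balances work when keys are evenly distributed; rebalancing skewed ranges at run time is the part that dynamic splitting would add.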