Posted to issues@beam.apache.org by "Pablo Estrada (Jira)" <ji...@apache.org> on 2022/03/23 20:44:00 UTC

[jira] [Created] (BEAM-14161) Add dynamic splitting to JdbcIO.readWithPartitions

Pablo Estrada created BEAM-14161:
------------------------------------

             Summary: Add dynamic splitting to JdbcIO.readWithPartitions
                 Key: BEAM-14161
                 URL: https://issues.apache.org/jira/browse/BEAM-14161
             Project: Beam
          Issue Type: Improvement
          Components: io-java-jdbc
            Reporter: Pablo Estrada
            Assignee: Jean-Baptiste Onofré
             Fix For: Not applicable


Currently, JdbcIO is essentially a {{DoFn}} executed with a {{ParDo}}. As a result, its parallelism is limited: the read runs on a single executor.
We could instead create several JDBC {{BoundedSource}}s, each reading a subset of the original SQL query's results (for instance by paging on row id, or with any other splitting/limit scheme we can derive from the original query), similar to what Sqoop does.
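
To illustrate the row-id splitting idea, here is a minimal sketch in plain Java. It is not Beam or JdbcIO code: the class QueryRangeSplitter, its methods, and the table/column names are hypothetical, and it assumes a numeric id column whose lower and upper bounds are already known. It only shows the initial (static) split of one query into sub-range queries that could each be read independently; dynamic splitting at runtime would need more than this.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Beam): splits an id range into sub-range queries. */
final class QueryRangeSplitter {

  /**
   * Splits the inclusive range [lower, upper] into at most numSplits contiguous
   * half-open ranges [start, end). Assumes upper - lower + 1 does not overflow a long.
   */
  static List<long[]> splitRange(long lower, long upper, int numSplits) {
    if (numSplits < 1 || upper < lower) {
      throw new IllegalArgumentException("need numSplits >= 1 and lower <= upper");
    }
    long total = upper - lower + 1;
    long splits = Math.min(numSplits, total); // never create empty ranges
    long base = total / splits;
    long remainder = total % splits;

    List<long[]> ranges = new ArrayList<>();
    long start = lower;
    for (long i = 0; i < splits; i++) {
      // The first `remainder` ranges get one extra id so sizes differ by at most 1.
      long size = base + (i < remainder ? 1 : 0);
      ranges.add(new long[] {start, start + size});
      start += size;
    }
    return ranges;
  }

  /** Builds one SELECT per range; table and column are assumed to be trusted identifiers. */
  static List<String> toQueries(String table, String column, List<long[]> ranges) {
    List<String> queries = new ArrayList<>();
    for (long[] r : ranges) {
      queries.add(String.format(
          "SELECT * FROM %s WHERE %s >= %d AND %s < %d", table, column, r[0], column, r[1]));
    }
    return queries;
  }
}
```

For example, QueryRangeSplitter.splitRange(1, 10, 3) yields the ranges [1, 5), [5, 8) and [8, 11), and toQueries("users", "id", ...) turns each into its own query that a separate worker could execute.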



--
This message was sent by Atlassian Jira
(v8.20.1#820001)