Posted to issues@beam.apache.org by "Pablo Estrada (Jira)" <ji...@apache.org> on 2022/03/23 20:46:00 UTC

[jira] [Updated] (BEAM-14161) Add dynamic splitting to JdbcIO.readWithPartitions

     [ https://issues.apache.org/jira/browse/BEAM-14161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pablo Estrada updated BEAM-14161:
---------------------------------
    Status: Open  (was: Triage Needed)

> Add dynamic splitting to JdbcIO.readWithPartitions
> --------------------------------------------------
>
>                 Key: BEAM-14161
>                 URL: https://issues.apache.org/jira/browse/BEAM-14161
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-jdbc
>            Reporter: Pablo Estrada
>            Assignee: Jean-Baptiste Onofré
>            Priority: P2
>             Fix For: Not applicable
>
>
> Currently, the JDBC IO read is essentially a {{DoFn}} executed in a {{ParDo}}, so its parallelism is limited: the read runs on a single executor. {{JdbcIO.readWithPartitions}} does some preliminary partitioning of the data, but any skew in the data range or in the workload still leaves the partitions unbalanced.
>  
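To make the skew problem in the description concrete, here is a minimal, self-contained Java sketch. It does not use Beam or JdbcIO. The class name, the method names, and the sample keys are all hypothetical, and the equal-width split is an assumed simplification of static range partitioning rather than JdbcIO's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not Beam code and not JdbcIO's implementation.
final class StaticRangePartitions {

    /** Splits [lower, upper) into at most numPartitions equal-width {start, end} ranges. */
    static List<long[]> splitRange(long lower, long upper, int numPartitions) {
        if (numPartitions <= 0 || upper <= lower) {
            throw new IllegalArgumentException("need numPartitions > 0 and upper > lower");
        }
        long span = upper - lower;
        long width = (span + numPartitions - 1) / numPartitions; // ceiling division
        List<long[]> ranges = new ArrayList<>();
        for (long start = lower; start < upper; start += width) {
            ranges.add(new long[] {start, Math.min(upper, start + width)});
        }
        return ranges;
    }

    /** Counts the keys that fall into each range; a stand-in for per-partition work. */
    static long[] rowsPerPartition(List<long[]> ranges, long[] keys) {
        long[] counts = new long[ranges.size()];
        for (long key : keys) {
            for (int i = 0; i < ranges.size(); i++) {
                long[] r = ranges.get(i);
                if (key >= r[0] && key < r[1]) {
                    counts[i]++;
                    break;
                }
            }
        }
        return counts;
    }

    /** Builds 100 made-up keys: 90 dense ids (0..89) plus 10 sparse ids up to 999. */
    static long[] skewedKeys() {
        long[] keys = new long[100];
        for (int i = 0; i < 90; i++) {
            keys[i] = i;
        }
        for (int i = 1; i <= 9; i++) {
            keys[89 + i] = 100L * i; // 100, 200, ..., 900
        }
        keys[99] = 999;
        return keys;
    }

    public static void main(String[] args) {
        List<long[]> ranges = splitRange(0, 1000, 4);
        long[] counts = rowsPerPartition(ranges, skewedKeys());
        // The first range holds almost all rows even though the ranges are equal in width.
        System.out.println(Arrays.toString(counts)); // prints [92, 2, 3, 3]
    }
}
```

With these made-up keys, one of the four equal-width ranges receives 92 of the 100 rows. Because the ranges are fixed up front, the worker that reads that range does most of the work, which is the imbalance that dynamic splitting is meant to address.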


