Posted to issues@beam.apache.org by "Ismaël Mejía (JIRA)" <ji...@apache.org> on 2019/04/06 13:52:00 UTC

[jira] [Updated] (BEAM-6670) Add `withOutputParallelization` option to disable reparallelization of JdbcIO.Read

     [ https://issues.apache.org/jira/browse/BEAM-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismaël Mejía updated BEAM-6670:
-------------------------------
    Summary: Add `withOutputParallelization` option to disable reparallelization of JdbcIO.Read  (was: Option to disable reparallelization in JdbcIO.Read)

> Add `withOutputParallelization` option to disable reparallelization of JdbcIO.Read
> ----------------------------------------------------------------------------------
>
>                 Key: BEAM-6670
>                 URL: https://issues.apache.org/jira/browse/BEAM-6670
>             Project: Beam
>          Issue Type: Wish
>          Components: io-java-jdbc
>            Reporter: Mike Pedersen
>            Priority: Minor
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> I'm running approximately 20 JDBC queries against a database and then joining the results by key. Each of these reads performs a reshuffle, which is largely redundant because the outputs are fed to a CoGroupByKey immediately afterwards.
> Reshuffling by default seems sensible by the principle of least surprise, but it would be nice to have a way to disable it when it is not necessary, for example a "withReshuffle(boolean)" method.
> This should be an easy addition, and I am willing to implement it if it sounds reasonable.
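
The option requested above could look roughly like the following. This is a minimal sketch in plain Java, not Beam's actual implementation: `JdbcReadSketch`, `read`, and `plannedSteps` are hypothetical names invented for illustration, and only the method name `withOutputParallelization` is taken from the issue summary. It assumes the flag defaults to true, so existing pipelines keep the reshuffle unless they opt out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a JDBC read transform; this is NOT Beam's JdbcIO.Read.
final class JdbcReadSketch {
    private final String query;
    private final boolean outputParallelization;

    private JdbcReadSketch(String query, boolean outputParallelization) {
        this.query = query;
        this.outputParallelization = outputParallelization;
    }

    // Assumption: reparallelization stays on by default (principle of least surprise).
    static JdbcReadSketch read(String query) {
        return new JdbcReadSketch(query, true);
    }

    // Returns a modified copy; the instance it is called on is left unchanged.
    JdbcReadSketch withOutputParallelization(boolean enabled) {
        return new JdbcReadSketch(query, enabled);
    }

    // Lists the steps this read would expand into, in order.
    List<String> plannedSteps() {
        List<String> steps = new ArrayList<>();
        steps.add("query: " + query);
        if (outputParallelization) {
            steps.add("reshuffle");
        }
        return steps;
    }

    public static void main(String[] args) {
        JdbcReadSketch byDefault = read("SELECT id, name FROM users");
        // Opt out when the output goes straight into a grouping step.
        JdbcReadSketch beforeJoin = byDefault.withOutputParallelization(false);
        System.out.println(byDefault.plannedSteps());
        System.out.println(beforeJoin.plannedSteps());
    }
}
```

Returning a copy from `withOutputParallelization` keeps each configured read independent, so one of the roughly 20 reads can opt out without affecting the others.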



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)