Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2020/08/01 17:07:04 UTC

[jira] [Commented] (BEAM-9748) Refactor Reparallelize as an alternative Reshuffle implementation

    [ https://issues.apache.org/jira/browse/BEAM-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169327#comment-17169327 ] 

Beam JIRA Bot commented on BEAM-9748:
-------------------------------------

This issue was marked "stale-assigned" and has not received a public comment in 7 days. It is now automatically unassigned. If you are still working on it, you can assign it to yourself again. Please also give an update about the status of the work.

> Refactor Reparallelize as an alternative Reshuffle implementation
> -----------------------------------------------------------------
>
>                 Key: BEAM-9748
>                 URL: https://issues.apache.org/jira/browse/BEAM-9748
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Ismaël Mejía
>            Priority: P3
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Some DoFn-based IOs such as JdbcIO and RedisIO reparallelize their outputs with a different approach: they combine an empty PCollectionView, which forces materialization, with Reshuffle.viaRandomKey, which reparallelizes the PCollection. This issue extracts that transform and exposes it as part of Reshuffle, so that transforms (notably IOs) that produce large amounts of sequentially generated data can reuse it instead of repeating the code, and can benefit from this alternative approach to reparallelize their output more effectively.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)