Posted to issues@beam.apache.org by "Robert Burke (Jira)" <ji...@apache.org> on 2021/10/08 22:01:00 UTC

[jira] [Commented] (BEAM-12999) Improve Reshuffle Transform

    [ https://issues.apache.org/jira/browse/BEAM-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426397#comment-17426397 ] 

Robert Burke commented on BEAM-12999:
-------------------------------------

How does one "reshuffle" using the user key? That is, rather than using a random key as currently implemented?

Much of the time these are implemented as composite transforms with a GroupByKey under the hood, so trying to keep keys together defeats the purpose of a reshuffle, AFAICT.
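
To make the distinction concrete, here is a minimal plain-Python sketch of the two behaviours being discussed. It is not Beam code and does not use any SDK API; the function names (group_by_key, reshuffle_random_key, reshuffle_by_user_key) and the num_buckets parameter are hypothetical, chosen only for illustration. It assumes a reshuffle is modelled as "assign a key, group by that key, then ungroup".

```python
import random
from collections import defaultdict


def group_by_key(pairs):
    """Toy stand-in for a GroupByKey: collect the values of each key into a list."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reshuffle_random_key(elements, num_buckets=4, rng=None):
    """Model of a random-key reshuffle (hypothetical helper).

    Each element is paired with a random bucket id, grouped by that id,
    then ungrouped. The original keys, if any, play no role in grouping.
    """
    if rng is None:
        rng = random.Random()
    keyed = [(rng.randrange(num_buckets), element) for element in elements]
    grouped = group_by_key(keyed)
    # Drop the synthetic keys and flatten the groups back into elements.
    return [element for bucket in grouped.values() for element in bucket]


def reshuffle_by_user_key(kvs):
    """Model of a reshuffle that groups on the user's own key (hypothetical helper).

    All values sharing a key land in the same group, so a hot key stays hot.
    """
    grouped = group_by_key(kvs)
    return [(key, value) for key, values in grouped.items() for value in values]
```

In this model, a hot key such as "a" in [("a", 1), ("b", 2), ("a", 3)] stays in a single group under reshuffle_by_user_key, whereas reshuffle_random_key may spread those elements across buckets, which is the redistribution a reshuffle is usually wanted for.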

> Improve Reshuffle Transform
> ---------------------------
>
>                 Key: BEAM-12999
>                 URL: https://issues.apache.org/jira/browse/BEAM-12999
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-go, sdk-java-core, sdk-py-core
>            Reporter: Ke Wu
>            Assignee: Ke Wu
>            Priority: P2
>          Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> See discussion [https://lists.apache.org/thread.html/r83adaad3a512ad186f2f9dc9dc4bec2a789070677c07cdcaad6fcfa5%40%3Cdev.beam.apache.org%3E] 
>  
> The "beam:transform:reshuffle:v1" transform represents semantically different transforms in different SDKs. The proposal is to replace "beam:transform:reshuffle:v1" with two new URNs: one to represent reshuffling a KV PCollection using the K, and the other to represent reshuffling based on a random key.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)