Posted to commits@beam.apache.org by "Amit Sela (JIRA)" <ji...@apache.org> on 2017/03/27 13:28:41 UTC

[jira] [Created] (BEAM-1815) Avoid shuffling twice in GABW

Amit Sela created BEAM-1815:
-------------------------------

             Summary: Avoid shuffling twice in GABW
                 Key: BEAM-1815
                 URL: https://issues.apache.org/jira/browse/BEAM-1815
             Project: Beam
          Issue Type: Bug
          Components: runner-spark
            Reporter: Amit Sela
            Assignee: Amit Sela


The Spark runner's implementation of GABW includes a "built-in" groupByKey, but the BOBK step before it already groups the data. To avoid an unnecessary second shuffle, we need to force a {{Partitioner}} on the RDDs involved.
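
To illustrate the idea only: the sketch below is a toy, in-memory model in plain Python, not the Spark API and not the Beam Spark runner code. All names in it (ToyPairRDD, HashPartitioner, group_by_key, run) are hypothetical. It assumes the rule that a grouping step can skip the shuffle when its input is already partitioned by an equal partitioner, and it models the "not forced" case as a second grouping that uses a different partitioner.

```python
from collections import defaultdict


class HashPartitioner:
    """Toy stand-in for a partitioner (hypothetical, not Spark's class)."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions

    def partition_for(self, key):
        return hash(key) % self.num_partitions

    def __eq__(self, other):
        # Two partitioners are equal when they place keys identically.
        return (isinstance(other, HashPartitioner)
                and other.num_partitions == self.num_partitions)

    def __hash__(self):
        return hash(self.num_partitions)


class ToyPairRDD:
    """Toy keyed dataset: a list of partitions plus an optional partitioner."""

    shuffles = 0  # number of simulated shuffles performed so far

    def __init__(self, partitions, partitioner=None):
        self.partitions = partitions
        self.partitioner = partitioner

    def group_by_key(self, partitioner):
        if self.partitioner == partitioner:
            # Keys are already co-located: group within each partition.
            source = self.partitions
        else:
            # Partitioning unknown or different: redistribute every record.
            ToyPairRDD.shuffles += 1
            source = [[] for _ in range(partitioner.num_partitions)]
            for part in self.partitions:
                for key, value in part:
                    source[partitioner.partition_for(key)].append((key, value))
        grouped = []
        for part in source:
            groups = defaultdict(list)
            for key, value in part:
                groups[key].append(value)
            grouped.append(list(groups.items()))
        return ToyPairRDD(grouped, partitioner)


def run(force_partitioner):
    """Group twice and return how many simulated shuffles happened."""
    ToyPairRDD.shuffles = 0
    records = ToyPairRDD([[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]])
    first = records.group_by_key(HashPartitioner(2))
    second = HashPartitioner(2) if force_partitioner else HashPartitioner(3)
    first.group_by_key(second)
    return ToyPairRDD.shuffles
```

In this model, run(True) returns 1 because the second grouping reuses the existing partitioning, while run(False) returns 2 because the mismatched partitioner triggers a second redistribution.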



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)