Posted to issues@beam.apache.org by "Brian Hulette (Jira)" <ji...@apache.org> on 2022/01/05 01:00:32 UTC

[jira] [Comment Edited] (BEAM-5049) Multiple batch joins on the same key results in two shuffles

    [ https://issues.apache.org/jira/browse/BEAM-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468927#comment-17468927 ] 

Brian Hulette edited comment on BEAM-5049 at 1/5/22, 1:00 AM:
--------------------------------------------------------------

Adding the starter label to this since it's a nice, well-defined task. It's probably tricky to implement, though, as it will likely require digging into the SQL optimizer and adding a new rule.

Do you have any advice here [~ibzib]? 


was (Author: bhulette):
Adding the starter label to this since it's a nice, well-defined task. It's probably tricky to implement, though, as it will require digging into the SQL optimizer.

> Multiple batch joins on the same key results in two shuffles
> ------------------------------------------------------------
>
>                 Key: BEAM-5049
>                 URL: https://issues.apache.org/jira/browse/BEAM-5049
>             Project: Beam
>          Issue Type: Improvement
>          Components: dsl-sql
>            Reporter: Anton Kedin
>            Priority: P2
>              Labels: starter
>
> A query like this:
> {code}
> SELECT a.*, b.*, c.* FROM a JOIN b ON a.some_id = b.some_id JOIN c ON a.some_id = c.some_id;
> {code}
> results in two shuffles, even though both joins use the same key. This can probably be optimized.
> Relevant code:
> - BeamJoinRel implements Join in SQL: https://github.com/apache/beam/blob/1675b0f843ed34de8ba6f3676f794db80b40139d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamJoinRel.java#L194
> - CoGBK Join implementation: https://github.com/apache/beam/blob/279a05604b83a54e8e5a79e13d8761f94841f326/sdks/java/extensions/join-library/src/main/java/org/apache/beam/sdk/extensions/joinlibrary/Join.java#L36
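
To picture what the optimization is after: both joins in the query above use the same key (some_id), so in principle all three inputs could be grouped by that key once, and the joined rows produced from each group. The following is only a minimal sketch of that idea using plain Java collections. It is not Beam code and does not show how an optimizer rule would be written; the class name, method names, and the string row format are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: plain Java collections, not the Beam API.
class ThreeWayJoinSketch {

    // Values seen for one key, separated by the input they came from.
    static final class Grouped {
        final List<String> a = new ArrayList<>();
        final List<String> b = new ArrayList<>();
        final List<String> c = new ArrayList<>();
    }

    // One grouping pass over all three inputs. This stands in for a single
    // shuffle, as opposed to grouping a with b and then the result with c.
    static Map<Integer, Grouped> coGroup(
            List<Map.Entry<Integer, String>> a,
            List<Map.Entry<Integer, String>> b,
            List<Map.Entry<Integer, String>> c) {
        Map<Integer, Grouped> groups = new HashMap<>();
        for (Map.Entry<Integer, String> e : a) {
            groups.computeIfAbsent(e.getKey(), k -> new Grouped()).a.add(e.getValue());
        }
        for (Map.Entry<Integer, String> e : b) {
            groups.computeIfAbsent(e.getKey(), k -> new Grouped()).b.add(e.getValue());
        }
        for (Map.Entry<Integer, String> e : c) {
            groups.computeIfAbsent(e.getKey(), k -> new Grouped()).c.add(e.getValue());
        }
        return groups;
    }

    // Inner-join semantics: for each key, emit every combination of an a row,
    // a b row, and a c row. Keys missing from any input produce no output.
    static List<String> innerJoin(Map<Integer, Grouped> groups) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<Integer, Grouped> g : groups.entrySet()) {
            for (String x : g.getValue().a) {
                for (String y : g.getValue().b) {
                    for (String z : g.getValue().c) {
                        rows.add(g.getKey() + ":" + x + "," + y + "," + z);
                    }
                }
            }
        }
        return rows;
    }
}
```

Calling innerJoin(coGroup(a, b, c)) yields the same rows as joining a with b and then with c on the shared key, but the inputs are grouped only once.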



--
This message was sent by Atlassian Jira
(v8.20.1#820001)