You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "L. C. Hsieh (Jira)" <ji...@apache.org> on 2020/06/17 01:55:00 UTC

[jira] [Created] (SPARK-32012) Incrementally create and materialize query stage to avoid unnecessary local shuffle

L. C. Hsieh created SPARK-32012:
-----------------------------------

             Summary: Incrementally create and materialize query stage to avoid unnecessary local shuffle
                 Key: SPARK-32012
                 URL: https://issues.apache.org/jira/browse/SPARK-32012
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.1.0
            Reporter: L. C. Hsieh
            Assignee: L. C. Hsieh


The current way of creating query stage in AQE is in batch. For example, the children of a sort merge join will be materialized as query stages in a batch. Then AQE brings the optimization in and optimize sort merge join to broadcast join. Except for the broadcasted exchange, we don't need do any exchange on another side of join but we already materialized the exchange. Currently AQE wraps the materialized exchange with local reader, but it still brings unnecessary I/O. We can avoid unnecessary local shuffle by incrementally creating query stage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org