Posted to issues@spark.apache.org by "jiaan.geng (Jira)" <ji...@apache.org> on 2023/07/27 11:10:00 UTC
[jira] [Created] (SPARK-44571) Eliminate the Join by Combine multiple Aggregates
jiaan.geng created SPARK-44571:
----------------------------------
Summary: Eliminate the Join by Combine multiple Aggregates
Key: SPARK-44571
URL: https://issues.apache.org/jira/browse/SPARK-44571
Project: Spark
Issue Type: New Feature
Components: SQL
Affects Versions: 3.5.0
Reporter: jiaan.geng
Recently, I investigated q28, one of the TPC-DS queries.
The query contains multiple scalar subqueries with aggregation, connected by inner joins.
If we can merge the filters and aggregates, we can scan the data source only once and eliminate the joins, which avoids the shuffle. We expect this change to improve performance.
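
The rewrite described above can be illustrated at the SQL level. The following is only a minimal sketch, not Spark's optimizer rule: it uses SQLite from the Python standard library, and the table `sales`, its columns `quantity` and `list_price`, and the sample rows are all hypothetical. It shows that joined aggregate subqueries over the same table can be expressed as a single scan whose filters are merged with OR and pushed into conditional aggregates.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (quantity INTEGER, list_price REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 10.0), (3, 20.0), (7, 30.0), (9, 50.0), (12, 70.0)],
)

# Original shape: one aggregate subquery per filter, combined with a join.
# Each subquery scans the table separately.
joined_sql = """
SELECT b1.avg_price AS b1_avg, b1.cnt AS b1_cnt,
       b2.avg_price AS b2_avg, b2.cnt AS b2_cnt
FROM (SELECT AVG(list_price) AS avg_price, COUNT(list_price) AS cnt
      FROM sales WHERE quantity BETWEEN 0 AND 5) b1,
     (SELECT AVG(list_price) AS avg_price, COUNT(list_price) AS cnt
      FROM sales WHERE quantity BETWEEN 6 AND 10) b2
"""

# Merged shape: a single scan. The subquery filters are OR-ed in WHERE,
# and each aggregate only sees the rows matching its own filter.
merged_sql = """
SELECT AVG(CASE WHEN quantity BETWEEN 0 AND 5 THEN list_price END) AS b1_avg,
       COUNT(CASE WHEN quantity BETWEEN 0 AND 5 THEN list_price END) AS b1_cnt,
       AVG(CASE WHEN quantity BETWEEN 6 AND 10 THEN list_price END) AS b2_avg,
       COUNT(CASE WHEN quantity BETWEEN 6 AND 10 THEN list_price END) AS b2_cnt
FROM sales
WHERE quantity BETWEEN 0 AND 5 OR quantity BETWEEN 6 AND 10
"""

joined_row = conn.execute(joined_sql).fetchone()
merged_row = conn.execute(merged_sql).fetchone()
print(joined_row)
print(merged_row)
```

Both queries return the same single row here. The merged form relies on the aggregates ignoring NULL inputs, so this sketch covers aggregates such as AVG and COUNT over a column; whether a given aggregate can be merged this way in Spark is outside what this example shows.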