You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@spark.apache.org by yu...@apache.org on 2021/08/19 08:45:59 UTC
[spark] branch branch-3.2 updated: [SPARK-36444][SQL] Remove
OptimizeSubqueries from batch of PartitionPruning
This is an automated email from the ASF dual-hosted git repository.
yumwang pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.2 by this push:
new 5b971650 [SPARK-36444][SQL] Remove OptimizeSubqueries from batch of PartitionPruning
5b971650 is described below
commit 5b971650bdf67b5c7ed562998ec45b44312060ed
Author: Yuming Wang <yu...@ebay.com>
AuthorDate: Thu Aug 19 16:44:50 2021 +0800
[SPARK-36444][SQL] Remove OptimizeSubqueries from batch of PartitionPruning
### What changes were proposed in this pull request?
Remove `OptimizeSubqueries` from batch of `PartitionPruning` to make DPP support more cases. For example:
```sql
SELECT date_id, product_id FROM fact_sk f
JOIN (select store_id + 3 as new_store_id from dim_store where country = 'US') s
ON f.store_id = s.new_store_id
```
Before this PR:
```
== Physical Plan ==
*(2) Project [date_id#3998, product_id#3999]
+- *(2) BroadcastHashJoin [store_id#4001], [new_store_id#3997], Inner, BuildRight, false
:- *(2) ColumnarToRow
: +- FileScan parquet default.fact_sk[date_id#3998,product_id#3999,store_id#4001] Batched: true, DataFilters: [], Format: Parquet, PartitionFilters: [isnotnull(store_id#4001), dynamicpruningexpression(true)], PushedFilters: [], ReadSchema: struct<date_id:int,product_id:int>
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#274]
+- *(1) Project [(store_id#4002 + 3) AS new_store_id#3997]
+- *(1) Filter ((isnotnull(country#4004) AND (country#4004 = US)) AND isnotnull((store_id#4002 + 3)))
+- *(1) ColumnarToRow
+- FileScan parquet default.dim_store[store_id#4002,country#4004] Batched: true, DataFilters: [isnotnull(country#4004), (country#4004 = US), isnotnull((store_id#4002 + 3))], Format: Parquet, PartitionFilters: [], PushedFilters: [IsNotNull(country), EqualTo(country,US)], ReadSchema: struct<store_id:int,country:string>
```
After this PR:
```
== Physical Plan ==
*(2) Project [date_id#3998, product_id#3999]
+- *(2) BroadcastHashJoin [store_id#4001], [new_store_id#3997], Inner, BuildRight, false
:- *(2) ColumnarToRow
: +- FileScan parquet default.fact_sk[date_id#3998,product_id#3999,store_id#4001] Batched: true, DataFilters: [], Format: Parquet, PartitionFilters: [isnotnull(store_id#4001), dynamicpruningexpression(store_id#4001 IN dynamicpruning#4007)], PushedFilters: [], ReadSchema: struct<date_id:int,product_id:int>
: +- SubqueryBroadcast dynamicpruning#4007, 0, [new_store_id#3997], [id=#263]
: +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#262]
: +- *(1) Project [(store_id#4002 + 3) AS new_store_id#3997]
: +- *(1) Filter ((isnotnull(country#4004) AND (country#4004 = US)) AND isnotnull((store_id#4002 + 3)))
: +- *(1) ColumnarToRow
: +- FileScan parquet default.dim_store[store_id#4002,country#4004] Batched: true, DataFilters: [isnotnull(country#4004), (country#4004 = US), isnotnull((store_id#4002 + 3))], Format: Parquet, PartitionFilters: [], PushedFilters: [IsNotNull(country), EqualTo(country,US)], ReadSchema: struct<store_id:int,country:string>
+- ReusedExchange [new_store_id#3997], BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#262]
```
This is because `OptimizeSubqueries` will infer more filters, so we cannot reuse broadcasts. The following is the plan if disable `spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly`:
```
== Physical Plan ==
*(2) Project [date_id#3998, product_id#3999]
+- *(2) BroadcastHashJoin [store_id#4001], [new_store_id#3997], Inner, BuildRight, false
:- *(2) ColumnarToRow
: +- FileScan parquet default.fact_sk[date_id#3998,product_id#3999,store_id#4001] Batched: true, DataFilters: [], Format: Parquet, PartitionFilters: [isnotnull(store_id#4001), dynamicpruningexpression(store_id#4001 IN subquery#4009)], PushedFilters: [], ReadSchema: struct<date_id:int,product_id:int>
: +- Subquery subquery#4009, [id=#284]
: +- *(2) HashAggregate(keys=[new_store_id#3997#4008], functions=[])
: +- Exchange hashpartitioning(new_store_id#3997#4008, 5), ENSURE_REQUIREMENTS, [id=#280]
: +- *(1) HashAggregate(keys=[new_store_id#3997 AS new_store_id#3997#4008], functions=[])
: +- *(1) Project [(store_id#4002 + 3) AS new_store_id#3997]
: +- *(1) Filter (((isnotnull(store_id#4002) AND isnotnull(country#4004)) AND (country#4004 = US)) AND isnotnull((store_id#4002 + 3)))
: +- *(1) ColumnarToRow
: +- FileScan parquet default.dim_store[store_id#4002,country#4004] Batched: true, DataFilters: [isnotnull(store_id#4002), isnotnull(country#4004), (country#4004 = US), isnotnull((store_id#4002..., Format: Parquet, PartitionFilters: [], PushedFilters: [IsNotNull(store_id), IsNotNull(country), EqualTo(country,US)], ReadSchema: struct<store_id:int,country:string>
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#305]
+- *(1) Project [(store_id#4002 + 3) AS new_store_id#3997]
+- *(1) Filter ((isnotnull(country#4004) AND (country#4004 = US)) AND isnotnull((store_id#4002 + 3)))
+- *(1) ColumnarToRow
+- FileScan parquet default.dim_store[store_id#4002,country#4004] Batched: true, DataFilters: [isnotnull(country#4004), (country#4004 = US), isnotnull((store_id#4002 + 3))], Format: Parquet, PartitionFilters: [], PushedFilters: [IsNotNull(country), EqualTo(country,US)], ReadSchema: struct<store_id:int,country:string>
```
### Why are the changes needed?
Improve DPP to support more cases.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Unit test and benchmark test:
SQL | Before this PR(Seconds) | After this PR(Seconds)
-- | -- | --
TPC-DS q58 | 40 | 20
TPC-DS q83 | 18 | 14
Closes #33664 from wangyum/SPARK-36444.
Authored-by: Yuming Wang <yu...@ebay.com>
Signed-off-by: Yuming Wang <yu...@ebay.com>
(cherry picked from commit 2310b99e146b9856766611a2b4359f0fbee2dd44)
Signed-off-by: Yuming Wang <yu...@ebay.com>
---
.../spark/sql/execution/SparkOptimizer.scala | 3 +-
.../approved-plans-v1_4/q58.sf100/explain.txt | 567 +++++++++++----------
.../approved-plans-v1_4/q58.sf100/simplified.txt | 44 +-
.../approved-plans-v1_4/q58/explain.txt | 545 ++++++++++----------
.../approved-plans-v1_4/q58/simplified.txt | 60 ++-
.../approved-plans-v1_4/q83.sf100/explain.txt | 542 ++++++++++----------
.../approved-plans-v1_4/q83.sf100/simplified.txt | 50 +-
.../approved-plans-v1_4/q83/explain.txt | 504 +++++++++---------
.../approved-plans-v1_4/q83/simplified.txt | 62 +--
.../spark/sql/DynamicPartitionPruningSuite.scala | 14 +
10 files changed, 1235 insertions(+), 1156 deletions(-)
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala
index dde5dc2..dc3ceb5 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala
@@ -42,8 +42,7 @@ class SparkOptimizer(
override def defaultBatches: Seq[Batch] = (preOptimizationBatches ++ super.defaultBatches :+
Batch("Optimize Metadata Only Query", Once, OptimizeMetadataOnlyQuery(catalog)) :+
Batch("PartitionPruning", Once,
- PartitionPruning,
- OptimizeSubqueries) :+
+ PartitionPruning) :+
Batch("Pushdown Filters from PartitionPruning", fixedPoint,
PushDownPredicates) :+
Batch("Cleanup filters that cannot be pushed down", Once,
diff --git a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58.sf100/explain.txt b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58.sf100/explain.txt
index d6b61f6..8e96909 100644
--- a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58.sf100/explain.txt
+++ b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58.sf100/explain.txt
@@ -1,70 +1,60 @@
== Physical Plan ==
-TakeOrderedAndProject (59)
-+- * Project (58)
- +- * BroadcastHashJoin Inner BuildRight (57)
- :- * Project (42)
- : +- * BroadcastHashJoin Inner BuildRight (41)
- : :- * Filter (26)
- : : +- * HashAggregate (25)
- : : +- Exchange (24)
- : : +- * HashAggregate (23)
- : : +- * Project (22)
- : : +- * BroadcastHashJoin Inner BuildRight (21)
- : : :- * Project (16)
- : : : +- * BroadcastHashJoin Inner BuildRight (15)
+TakeOrderedAndProject (49)
++- * Project (48)
+ +- * BroadcastHashJoin Inner BuildRight (47)
+ :- * Project (32)
+ : +- * BroadcastHashJoin Inner BuildRight (31)
+ : :- * Filter (16)
+ : : +- * HashAggregate (15)
+ : : +- Exchange (14)
+ : : +- * HashAggregate (13)
+ : : +- * Project (12)
+ : : +- * BroadcastHashJoin Inner BuildRight (11)
+ : : :- * Project (6)
+ : : : +- * BroadcastHashJoin Inner BuildRight (5)
: : : :- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.store_sales (1)
- : : : +- BroadcastExchange (14)
- : : : +- * Project (13)
- : : : +- * BroadcastHashJoin LeftSemi BuildRight (12)
- : : : :- * Filter (6)
- : : : : +- * ColumnarToRow (5)
- : : : : +- Scan parquet default.date_dim (4)
- : : : +- BroadcastExchange (11)
- : : : +- * Project (10)
- : : : +- * Filter (9)
- : : : +- * ColumnarToRow (8)
- : : : +- Scan parquet default.date_dim (7)
- : : +- BroadcastExchange (20)
- : : +- * Filter (19)
- : : +- * ColumnarToRow (18)
- : : +- Scan parquet default.item (17)
- : +- BroadcastExchange (40)
- : +- * Filter (39)
- : +- * HashAggregate (38)
- : +- Exchange (37)
- : +- * HashAggregate (36)
- : +- * Project (35)
- : +- * BroadcastHashJoin Inner BuildRight (34)
- : :- * Project (32)
- : : +- * BroadcastHashJoin Inner BuildRight (31)
- : : :- * Filter (29)
- : : : +- * ColumnarToRow (28)
- : : : +- Scan parquet default.catalog_sales (27)
- : : +- ReusedExchange (30)
- : +- ReusedExchange (33)
- +- BroadcastExchange (56)
- +- * Filter (55)
- +- * HashAggregate (54)
- +- Exchange (53)
- +- * HashAggregate (52)
- +- * Project (51)
- +- * BroadcastHashJoin Inner BuildRight (50)
- :- * Project (48)
- : +- * BroadcastHashJoin Inner BuildRight (47)
- : :- * Filter (45)
- : : +- * ColumnarToRow (44)
- : : +- Scan parquet default.web_sales (43)
- : +- ReusedExchange (46)
- +- ReusedExchange (49)
+ : : : +- ReusedExchange (4)
+ : : +- BroadcastExchange (10)
+ : : +- * Filter (9)
+ : : +- * ColumnarToRow (8)
+ : : +- Scan parquet default.item (7)
+ : +- BroadcastExchange (30)
+ : +- * Filter (29)
+ : +- * HashAggregate (28)
+ : +- Exchange (27)
+ : +- * HashAggregate (26)
+ : +- * Project (25)
+ : +- * BroadcastHashJoin Inner BuildRight (24)
+ : :- * Project (22)
+ : : +- * BroadcastHashJoin Inner BuildRight (21)
+ : : :- * Filter (19)
+ : : : +- * ColumnarToRow (18)
+ : : : +- Scan parquet default.catalog_sales (17)
+ : : +- ReusedExchange (20)
+ : +- ReusedExchange (23)
+ +- BroadcastExchange (46)
+ +- * Filter (45)
+ +- * HashAggregate (44)
+ +- Exchange (43)
+ +- * HashAggregate (42)
+ +- * Project (41)
+ +- * BroadcastHashJoin Inner BuildRight (40)
+ :- * Project (38)
+ : +- * BroadcastHashJoin Inner BuildRight (37)
+ : :- * Filter (35)
+ : : +- * ColumnarToRow (34)
+ : : +- Scan parquet default.web_sales (33)
+ : +- ReusedExchange (36)
+ +- ReusedExchange (39)
(1) Scan parquet default.store_sales
Output [3]: [ss_item_sk#1, ss_ext_sales_price#2, ss_sold_date_sk#3]
Batched: true
Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(ss_sold_date_sk#3), dynamicpruningexpression(true)]
+PartitionFilters: [isnotnull(ss_sold_date_sk#3), dynamicpruningexpression(ss_sold_date_sk#3 IN dynamicpruning#4)]
PushedFilters: [IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_item_sk:int,ss_ext_sales_price:decimal(7,2)>
@@ -75,290 +65,311 @@ Input [3]: [ss_item_sk#1, ss_ext_sales_price#2, ss_sold_date_sk#3]
Input [3]: [ss_item_sk#1, ss_ext_sales_price#2, ss_sold_date_sk#3]
Condition : isnotnull(ss_item_sk#1)
-(4) Scan parquet default.date_dim
-Output [2]: [d_date_sk#4, d_date#5]
-Batched: true
-Location [not included in comparison]/{warehouse_dir}/date_dim]
-PushedFilters: [IsNotNull(d_date_sk)]
-ReadSchema: struct<d_date_sk:int,d_date:date>
-
-(5) ColumnarToRow [codegen id : 2]
-Input [2]: [d_date_sk#4, d_date#5]
-
-(6) Filter [codegen id : 2]
-Input [2]: [d_date_sk#4, d_date#5]
-Condition : isnotnull(d_date_sk#4)
-
-(7) Scan parquet default.date_dim
-Output [2]: [d_date#6, d_week_seq#7]
-Batched: true
-Location [not included in comparison]/{warehouse_dir}/date_dim]
-PushedFilters: [IsNotNull(d_week_seq)]
-ReadSchema: struct<d_date:date,d_week_seq:int>
+(4) ReusedExchange [Reuses operator id: 60]
+Output [1]: [d_date_sk#5]
-(8) ColumnarToRow [codegen id : 1]
-Input [2]: [d_date#6, d_week_seq#7]
-
-(9) Filter [codegen id : 1]
-Input [2]: [d_date#6, d_week_seq#7]
-Condition : (isnotnull(d_week_seq#7) AND (d_week_seq#7 = Subquery scalar-subquery#8, [id=#9]))
-
-(10) Project [codegen id : 1]
-Output [1]: [d_date#6]
-Input [2]: [d_date#6, d_week_seq#7]
-
-(11) BroadcastExchange
-Input [1]: [d_date#6]
-Arguments: HashedRelationBroadcastMode(List(input[0, date, true]),false), [id=#10]
-
-(12) BroadcastHashJoin [codegen id : 2]
-Left keys [1]: [d_date#5]
-Right keys [1]: [d_date#6]
-Join condition: None
-
-(13) Project [codegen id : 2]
-Output [1]: [d_date_sk#4]
-Input [2]: [d_date_sk#4, d_date#5]
-
-(14) BroadcastExchange
-Input [1]: [d_date_sk#4]
-Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#11]
-
-(15) BroadcastHashJoin [codegen id : 4]
+(5) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#3]
-Right keys [1]: [d_date_sk#4]
+Right keys [1]: [d_date_sk#5]
Join condition: None
-(16) Project [codegen id : 4]
+(6) Project [codegen id : 4]
Output [2]: [ss_item_sk#1, ss_ext_sales_price#2]
-Input [4]: [ss_item_sk#1, ss_ext_sales_price#2, ss_sold_date_sk#3, d_date_sk#4]
+Input [4]: [ss_item_sk#1, ss_ext_sales_price#2, ss_sold_date_sk#3, d_date_sk#5]
-(17) Scan parquet default.item
-Output [2]: [i_item_sk#12, i_item_id#13]
+(7) Scan parquet default.item
+Output [2]: [i_item_sk#6, i_item_id#7]
Batched: true
Location [not included in comparison]/{warehouse_dir}/item]
PushedFilters: [IsNotNull(i_item_sk), IsNotNull(i_item_id)]
ReadSchema: struct<i_item_sk:int,i_item_id:string>
-(18) ColumnarToRow [codegen id : 3]
-Input [2]: [i_item_sk#12, i_item_id#13]
+(8) ColumnarToRow [codegen id : 3]
+Input [2]: [i_item_sk#6, i_item_id#7]
-(19) Filter [codegen id : 3]
-Input [2]: [i_item_sk#12, i_item_id#13]
-Condition : (isnotnull(i_item_sk#12) AND isnotnull(i_item_id#13))
+(9) Filter [codegen id : 3]
+Input [2]: [i_item_sk#6, i_item_id#7]
+Condition : (isnotnull(i_item_sk#6) AND isnotnull(i_item_id#7))
-(20) BroadcastExchange
-Input [2]: [i_item_sk#12, i_item_id#13]
-Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#14]
+(10) BroadcastExchange
+Input [2]: [i_item_sk#6, i_item_id#7]
+Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#8]
-(21) BroadcastHashJoin [codegen id : 4]
+(11) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_item_sk#1]
-Right keys [1]: [i_item_sk#12]
+Right keys [1]: [i_item_sk#6]
Join condition: None
-(22) Project [codegen id : 4]
-Output [2]: [ss_ext_sales_price#2, i_item_id#13]
-Input [4]: [ss_item_sk#1, ss_ext_sales_price#2, i_item_sk#12, i_item_id#13]
+(12) Project [codegen id : 4]
+Output [2]: [ss_ext_sales_price#2, i_item_id#7]
+Input [4]: [ss_item_sk#1, ss_ext_sales_price#2, i_item_sk#6, i_item_id#7]
-(23) HashAggregate [codegen id : 4]
-Input [2]: [ss_ext_sales_price#2, i_item_id#13]
-Keys [1]: [i_item_id#13]
+(13) HashAggregate [codegen id : 4]
+Input [2]: [ss_ext_sales_price#2, i_item_id#7]
+Keys [1]: [i_item_id#7]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#2))]
-Aggregate Attributes [1]: [sum#15]
-Results [2]: [i_item_id#13, sum#16]
+Aggregate Attributes [1]: [sum#9]
+Results [2]: [i_item_id#7, sum#10]
-(24) Exchange
-Input [2]: [i_item_id#13, sum#16]
-Arguments: hashpartitioning(i_item_id#13, 5), ENSURE_REQUIREMENTS, [id=#17]
+(14) Exchange
+Input [2]: [i_item_id#7, sum#10]
+Arguments: hashpartitioning(i_item_id#7, 5), ENSURE_REQUIREMENTS, [id=#11]
-(25) HashAggregate [codegen id : 15]
-Input [2]: [i_item_id#13, sum#16]
-Keys [1]: [i_item_id#13]
+(15) HashAggregate [codegen id : 15]
+Input [2]: [i_item_id#7, sum#10]
+Keys [1]: [i_item_id#7]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#2))]
-Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#2))#18]
-Results [2]: [i_item_id#13 AS item_id#19, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#2))#18,17,2) AS ss_item_rev#20]
+Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#2))#12]
+Results [2]: [i_item_id#7 AS item_id#13, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#2))#12,17,2) AS ss_item_rev#14]
-(26) Filter [codegen id : 15]
-Input [2]: [item_id#19, ss_item_rev#20]
-Condition : isnotnull(ss_item_rev#20)
+(16) Filter [codegen id : 15]
+Input [2]: [item_id#13, ss_item_rev#14]
+Condition : isnotnull(ss_item_rev#14)
-(27) Scan parquet default.catalog_sales
-Output [3]: [cs_item_sk#21, cs_ext_sales_price#22, cs_sold_date_sk#23]
+(17) Scan parquet default.catalog_sales
+Output [3]: [cs_item_sk#15, cs_ext_sales_price#16, cs_sold_date_sk#17]
Batched: true
Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(cs_sold_date_sk#23), dynamicpruningexpression(true)]
+PartitionFilters: [isnotnull(cs_sold_date_sk#17), dynamicpruningexpression(cs_sold_date_sk#17 IN dynamicpruning#4)]
PushedFilters: [IsNotNull(cs_item_sk)]
ReadSchema: struct<cs_item_sk:int,cs_ext_sales_price:decimal(7,2)>
-(28) ColumnarToRow [codegen id : 8]
-Input [3]: [cs_item_sk#21, cs_ext_sales_price#22, cs_sold_date_sk#23]
+(18) ColumnarToRow [codegen id : 8]
+Input [3]: [cs_item_sk#15, cs_ext_sales_price#16, cs_sold_date_sk#17]
-(29) Filter [codegen id : 8]
-Input [3]: [cs_item_sk#21, cs_ext_sales_price#22, cs_sold_date_sk#23]
-Condition : isnotnull(cs_item_sk#21)
+(19) Filter [codegen id : 8]
+Input [3]: [cs_item_sk#15, cs_ext_sales_price#16, cs_sold_date_sk#17]
+Condition : isnotnull(cs_item_sk#15)
-(30) ReusedExchange [Reuses operator id: 14]
-Output [1]: [d_date_sk#24]
+(20) ReusedExchange [Reuses operator id: 60]
+Output [1]: [d_date_sk#18]
-(31) BroadcastHashJoin [codegen id : 8]
-Left keys [1]: [cs_sold_date_sk#23]
-Right keys [1]: [d_date_sk#24]
+(21) BroadcastHashJoin [codegen id : 8]
+Left keys [1]: [cs_sold_date_sk#17]
+Right keys [1]: [d_date_sk#18]
Join condition: None
-(32) Project [codegen id : 8]
-Output [2]: [cs_item_sk#21, cs_ext_sales_price#22]
-Input [4]: [cs_item_sk#21, cs_ext_sales_price#22, cs_sold_date_sk#23, d_date_sk#24]
+(22) Project [codegen id : 8]
+Output [2]: [cs_item_sk#15, cs_ext_sales_price#16]
+Input [4]: [cs_item_sk#15, cs_ext_sales_price#16, cs_sold_date_sk#17, d_date_sk#18]
-(33) ReusedExchange [Reuses operator id: 20]
-Output [2]: [i_item_sk#25, i_item_id#26]
+(23) ReusedExchange [Reuses operator id: 10]
+Output [2]: [i_item_sk#19, i_item_id#20]
-(34) BroadcastHashJoin [codegen id : 8]
-Left keys [1]: [cs_item_sk#21]
-Right keys [1]: [i_item_sk#25]
+(24) BroadcastHashJoin [codegen id : 8]
+Left keys [1]: [cs_item_sk#15]
+Right keys [1]: [i_item_sk#19]
Join condition: None
-(35) Project [codegen id : 8]
-Output [2]: [cs_ext_sales_price#22, i_item_id#26]
-Input [4]: [cs_item_sk#21, cs_ext_sales_price#22, i_item_sk#25, i_item_id#26]
-
-(36) HashAggregate [codegen id : 8]
-Input [2]: [cs_ext_sales_price#22, i_item_id#26]
-Keys [1]: [i_item_id#26]
-Functions [1]: [partial_sum(UnscaledValue(cs_ext_sales_price#22))]
-Aggregate Attributes [1]: [sum#27]
-Results [2]: [i_item_id#26, sum#28]
-
-(37) Exchange
-Input [2]: [i_item_id#26, sum#28]
-Arguments: hashpartitioning(i_item_id#26, 5), ENSURE_REQUIREMENTS, [id=#29]
-
-(38) HashAggregate [codegen id : 9]
-Input [2]: [i_item_id#26, sum#28]
-Keys [1]: [i_item_id#26]
-Functions [1]: [sum(UnscaledValue(cs_ext_sales_price#22))]
-Aggregate Attributes [1]: [sum(UnscaledValue(cs_ext_sales_price#22))#30]
-Results [2]: [i_item_id#26 AS item_id#31, MakeDecimal(sum(UnscaledValue(cs_ext_sales_price#22))#30,17,2) AS cs_item_rev#32]
-
-(39) Filter [codegen id : 9]
-Input [2]: [item_id#31, cs_item_rev#32]
-Condition : isnotnull(cs_item_rev#32)
-
-(40) BroadcastExchange
-Input [2]: [item_id#31, cs_item_rev#32]
-Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#33]
-
-(41) BroadcastHashJoin [codegen id : 15]
-Left keys [1]: [item_id#19]
-Right keys [1]: [item_id#31]
-Join condition: ((((cast(ss_item_rev#20 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(cs_item_rev#32)), DecimalType(19,3), true)) AND (cast(ss_item_rev#20 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision(cs_item_rev#32)), DecimalType(20,3), true))) AND (cast(cs_item_rev#32 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ss_item_rev#20)), DecimalType(19,3), true))) AND (cast(cs_item_rev#32 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision [...]
-
-(42) Project [codegen id : 15]
-Output [3]: [item_id#19, ss_item_rev#20, cs_item_rev#32]
-Input [4]: [item_id#19, ss_item_rev#20, item_id#31, cs_item_rev#32]
-
-(43) Scan parquet default.web_sales
-Output [3]: [ws_item_sk#34, ws_ext_sales_price#35, ws_sold_date_sk#36]
+(25) Project [codegen id : 8]
+Output [2]: [cs_ext_sales_price#16, i_item_id#20]
+Input [4]: [cs_item_sk#15, cs_ext_sales_price#16, i_item_sk#19, i_item_id#20]
+
+(26) HashAggregate [codegen id : 8]
+Input [2]: [cs_ext_sales_price#16, i_item_id#20]
+Keys [1]: [i_item_id#20]
+Functions [1]: [partial_sum(UnscaledValue(cs_ext_sales_price#16))]
+Aggregate Attributes [1]: [sum#21]
+Results [2]: [i_item_id#20, sum#22]
+
+(27) Exchange
+Input [2]: [i_item_id#20, sum#22]
+Arguments: hashpartitioning(i_item_id#20, 5), ENSURE_REQUIREMENTS, [id=#23]
+
+(28) HashAggregate [codegen id : 9]
+Input [2]: [i_item_id#20, sum#22]
+Keys [1]: [i_item_id#20]
+Functions [1]: [sum(UnscaledValue(cs_ext_sales_price#16))]
+Aggregate Attributes [1]: [sum(UnscaledValue(cs_ext_sales_price#16))#24]
+Results [2]: [i_item_id#20 AS item_id#25, MakeDecimal(sum(UnscaledValue(cs_ext_sales_price#16))#24,17,2) AS cs_item_rev#26]
+
+(29) Filter [codegen id : 9]
+Input [2]: [item_id#25, cs_item_rev#26]
+Condition : isnotnull(cs_item_rev#26)
+
+(30) BroadcastExchange
+Input [2]: [item_id#25, cs_item_rev#26]
+Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#27]
+
+(31) BroadcastHashJoin [codegen id : 15]
+Left keys [1]: [item_id#13]
+Right keys [1]: [item_id#25]
+Join condition: ((((cast(ss_item_rev#14 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(cs_item_rev#26)), DecimalType(19,3), true)) AND (cast(ss_item_rev#14 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision(cs_item_rev#26)), DecimalType(20,3), true))) AND (cast(cs_item_rev#26 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ss_item_rev#14)), DecimalType(19,3), true))) AND (cast(cs_item_rev#26 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision [...]
+
+(32) Project [codegen id : 15]
+Output [3]: [item_id#13, ss_item_rev#14, cs_item_rev#26]
+Input [4]: [item_id#13, ss_item_rev#14, item_id#25, cs_item_rev#26]
+
+(33) Scan parquet default.web_sales
+Output [3]: [ws_item_sk#28, ws_ext_sales_price#29, ws_sold_date_sk#30]
Batched: true
Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(ws_sold_date_sk#36), dynamicpruningexpression(true)]
+PartitionFilters: [isnotnull(ws_sold_date_sk#30), dynamicpruningexpression(ws_sold_date_sk#30 IN dynamicpruning#4)]
PushedFilters: [IsNotNull(ws_item_sk)]
ReadSchema: struct<ws_item_sk:int,ws_ext_sales_price:decimal(7,2)>
-(44) ColumnarToRow [codegen id : 13]
-Input [3]: [ws_item_sk#34, ws_ext_sales_price#35, ws_sold_date_sk#36]
+(34) ColumnarToRow [codegen id : 13]
+Input [3]: [ws_item_sk#28, ws_ext_sales_price#29, ws_sold_date_sk#30]
-(45) Filter [codegen id : 13]
-Input [3]: [ws_item_sk#34, ws_ext_sales_price#35, ws_sold_date_sk#36]
-Condition : isnotnull(ws_item_sk#34)
+(35) Filter [codegen id : 13]
+Input [3]: [ws_item_sk#28, ws_ext_sales_price#29, ws_sold_date_sk#30]
+Condition : isnotnull(ws_item_sk#28)
-(46) ReusedExchange [Reuses operator id: 14]
-Output [1]: [d_date_sk#37]
+(36) ReusedExchange [Reuses operator id: 60]
+Output [1]: [d_date_sk#31]
-(47) BroadcastHashJoin [codegen id : 13]
-Left keys [1]: [ws_sold_date_sk#36]
-Right keys [1]: [d_date_sk#37]
+(37) BroadcastHashJoin [codegen id : 13]
+Left keys [1]: [ws_sold_date_sk#30]
+Right keys [1]: [d_date_sk#31]
Join condition: None
-(48) Project [codegen id : 13]
-Output [2]: [ws_item_sk#34, ws_ext_sales_price#35]
-Input [4]: [ws_item_sk#34, ws_ext_sales_price#35, ws_sold_date_sk#36, d_date_sk#37]
+(38) Project [codegen id : 13]
+Output [2]: [ws_item_sk#28, ws_ext_sales_price#29]
+Input [4]: [ws_item_sk#28, ws_ext_sales_price#29, ws_sold_date_sk#30, d_date_sk#31]
-(49) ReusedExchange [Reuses operator id: 20]
-Output [2]: [i_item_sk#38, i_item_id#39]
+(39) ReusedExchange [Reuses operator id: 10]
+Output [2]: [i_item_sk#32, i_item_id#33]
-(50) BroadcastHashJoin [codegen id : 13]
-Left keys [1]: [ws_item_sk#34]
-Right keys [1]: [i_item_sk#38]
+(40) BroadcastHashJoin [codegen id : 13]
+Left keys [1]: [ws_item_sk#28]
+Right keys [1]: [i_item_sk#32]
Join condition: None
-(51) Project [codegen id : 13]
-Output [2]: [ws_ext_sales_price#35, i_item_id#39]
-Input [4]: [ws_item_sk#34, ws_ext_sales_price#35, i_item_sk#38, i_item_id#39]
-
-(52) HashAggregate [codegen id : 13]
-Input [2]: [ws_ext_sales_price#35, i_item_id#39]
-Keys [1]: [i_item_id#39]
-Functions [1]: [partial_sum(UnscaledValue(ws_ext_sales_price#35))]
-Aggregate Attributes [1]: [sum#40]
-Results [2]: [i_item_id#39, sum#41]
-
-(53) Exchange
-Input [2]: [i_item_id#39, sum#41]
-Arguments: hashpartitioning(i_item_id#39, 5), ENSURE_REQUIREMENTS, [id=#42]
-
-(54) HashAggregate [codegen id : 14]
-Input [2]: [i_item_id#39, sum#41]
-Keys [1]: [i_item_id#39]
-Functions [1]: [sum(UnscaledValue(ws_ext_sales_price#35))]
-Aggregate Attributes [1]: [sum(UnscaledValue(ws_ext_sales_price#35))#43]
-Results [2]: [i_item_id#39 AS item_id#44, MakeDecimal(sum(UnscaledValue(ws_ext_sales_price#35))#43,17,2) AS ws_item_rev#45]
-
-(55) Filter [codegen id : 14]
-Input [2]: [item_id#44, ws_item_rev#45]
-Condition : isnotnull(ws_item_rev#45)
-
-(56) BroadcastExchange
-Input [2]: [item_id#44, ws_item_rev#45]
-Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#46]
-
-(57) BroadcastHashJoin [codegen id : 15]
-Left keys [1]: [item_id#19]
-Right keys [1]: [item_id#44]
-Join condition: ((((((((cast(ss_item_rev#20 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ws_item_rev#45)), DecimalType(19,3), true)) AND (cast(ss_item_rev#20 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision(ws_item_rev#45)), DecimalType(20,3), true))) AND (cast(cs_item_rev#32 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ws_item_rev#45)), DecimalType(19,3), true))) AND (cast(cs_item_rev#32 as decimal(20,3)) <= CheckOverflow((1.10 * promote_preci [...]
-
-(58) Project [codegen id : 15]
-Output [8]: [item_id#19, ss_item_rev#20, CheckOverflow((promote_precision(CheckOverflow((promote_precision(CheckOverflow((promote_precision(cast(ss_item_rev#20 as decimal(19,2))) / promote_precision(CheckOverflow((promote_precision(cast(CheckOverflow((promote_precision(cast(ss_item_rev#20 as decimal(18,2))) + promote_precision(cast(cs_item_rev#32 as decimal(18,2)))), DecimalType(18,2), true) as decimal(19,2))) + promote_precision(cast(ws_item_rev#45 as decimal(19,2)))), DecimalType(19,2) [...]
-Input [5]: [item_id#19, ss_item_rev#20, cs_item_rev#32, item_id#44, ws_item_rev#45]
-
-(59) TakeOrderedAndProject
-Input [8]: [item_id#19, ss_item_rev#20, ss_dev#47, cs_item_rev#32, cs_dev#48, ws_item_rev#45, ws_dev#49, average#50]
-Arguments: 100, [item_id#19 ASC NULLS FIRST, ss_item_rev#20 ASC NULLS FIRST], [item_id#19, ss_item_rev#20, ss_dev#47, cs_item_rev#32, cs_dev#48, ws_item_rev#45, ws_dev#49, average#50]
+(41) Project [codegen id : 13]
+Output [2]: [ws_ext_sales_price#29, i_item_id#33]
+Input [4]: [ws_item_sk#28, ws_ext_sales_price#29, i_item_sk#32, i_item_id#33]
+
+(42) HashAggregate [codegen id : 13]
+Input [2]: [ws_ext_sales_price#29, i_item_id#33]
+Keys [1]: [i_item_id#33]
+Functions [1]: [partial_sum(UnscaledValue(ws_ext_sales_price#29))]
+Aggregate Attributes [1]: [sum#34]
+Results [2]: [i_item_id#33, sum#35]
+
+(43) Exchange
+Input [2]: [i_item_id#33, sum#35]
+Arguments: hashpartitioning(i_item_id#33, 5), ENSURE_REQUIREMENTS, [id=#36]
+
+(44) HashAggregate [codegen id : 14]
+Input [2]: [i_item_id#33, sum#35]
+Keys [1]: [i_item_id#33]
+Functions [1]: [sum(UnscaledValue(ws_ext_sales_price#29))]
+Aggregate Attributes [1]: [sum(UnscaledValue(ws_ext_sales_price#29))#37]
+Results [2]: [i_item_id#33 AS item_id#38, MakeDecimal(sum(UnscaledValue(ws_ext_sales_price#29))#37,17,2) AS ws_item_rev#39]
+
+(45) Filter [codegen id : 14]
+Input [2]: [item_id#38, ws_item_rev#39]
+Condition : isnotnull(ws_item_rev#39)
+
+(46) BroadcastExchange
+Input [2]: [item_id#38, ws_item_rev#39]
+Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#40]
+
+(47) BroadcastHashJoin [codegen id : 15]
+Left keys [1]: [item_id#13]
+Right keys [1]: [item_id#38]
+Join condition: ((((((((cast(ss_item_rev#14 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ws_item_rev#39)), DecimalType(19,3), true)) AND (cast(ss_item_rev#14 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision(ws_item_rev#39)), DecimalType(20,3), true))) AND (cast(cs_item_rev#26 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ws_item_rev#39)), DecimalType(19,3), true))) AND (cast(cs_item_rev#26 as decimal(20,3)) <= CheckOverflow((1.10 * promote_preci [...]
+
+(48) Project [codegen id : 15]
+Output [8]: [item_id#13, ss_item_rev#14, CheckOverflow((promote_precision(CheckOverflow((promote_precision(CheckOverflow((promote_precision(cast(ss_item_rev#14 as decimal(19,2))) / promote_precision(CheckOverflow((promote_precision(cast(CheckOverflow((promote_precision(cast(ss_item_rev#14 as decimal(18,2))) + promote_precision(cast(cs_item_rev#26 as decimal(18,2)))), DecimalType(18,2), true) as decimal(19,2))) + promote_precision(cast(ws_item_rev#39 as decimal(19,2)))), DecimalType(19,2) [...]
+Input [5]: [item_id#13, ss_item_rev#14, cs_item_rev#26, item_id#38, ws_item_rev#39]
+
+(49) TakeOrderedAndProject
+Input [8]: [item_id#13, ss_item_rev#14, ss_dev#41, cs_item_rev#26, cs_dev#42, ws_item_rev#39, ws_dev#43, average#44]
+Arguments: 100, [item_id#13 ASC NULLS FIRST, ss_item_rev#14 ASC NULLS FIRST], [item_id#13, ss_item_rev#14, ss_dev#41, cs_item_rev#26, cs_dev#42, ws_item_rev#39, ws_dev#43, average#44]
===== Subqueries =====
-Subquery:1 Hosting operator id = 9 Hosting Expression = Subquery scalar-subquery#8, [id=#9]
-* Project (63)
-+- * Filter (62)
- +- * ColumnarToRow (61)
- +- Scan parquet default.date_dim (60)
+Subquery:1 Hosting operator id = 1 Hosting Expression = ss_sold_date_sk#3 IN dynamicpruning#4
+BroadcastExchange (60)
++- * Project (59)
+ +- * BroadcastHashJoin LeftSemi BuildRight (58)
+ :- * Filter (52)
+ : +- * ColumnarToRow (51)
+ : +- Scan parquet default.date_dim (50)
+ +- BroadcastExchange (57)
+ +- * Project (56)
+ +- * Filter (55)
+ +- * ColumnarToRow (54)
+ +- Scan parquet default.date_dim (53)
+
+
+(50) Scan parquet default.date_dim
+Output [2]: [d_date_sk#5, d_date#45]
+Batched: true
+Location [not included in comparison]/{warehouse_dir}/date_dim]
+PushedFilters: [IsNotNull(d_date_sk)]
+ReadSchema: struct<d_date_sk:int,d_date:date>
+
+(51) ColumnarToRow [codegen id : 2]
+Input [2]: [d_date_sk#5, d_date#45]
+
+(52) Filter [codegen id : 2]
+Input [2]: [d_date_sk#5, d_date#45]
+Condition : isnotnull(d_date_sk#5)
+
+(53) Scan parquet default.date_dim
+Output [2]: [d_date#46, d_week_seq#47]
+Batched: true
+Location [not included in comparison]/{warehouse_dir}/date_dim]
+PushedFilters: [IsNotNull(d_week_seq)]
+ReadSchema: struct<d_date:date,d_week_seq:int>
+
+(54) ColumnarToRow [codegen id : 1]
+Input [2]: [d_date#46, d_week_seq#47]
+(55) Filter [codegen id : 1]
+Input [2]: [d_date#46, d_week_seq#47]
+Condition : (isnotnull(d_week_seq#47) AND (d_week_seq#47 = Subquery scalar-subquery#48, [id=#49]))
-(60) Scan parquet default.date_dim
-Output [2]: [d_date#51, d_week_seq#52]
+(56) Project [codegen id : 1]
+Output [1]: [d_date#46]
+Input [2]: [d_date#46, d_week_seq#47]
+
+(57) BroadcastExchange
+Input [1]: [d_date#46]
+Arguments: HashedRelationBroadcastMode(List(input[0, date, true]),false), [id=#50]
+
+(58) BroadcastHashJoin [codegen id : 2]
+Left keys [1]: [d_date#45]
+Right keys [1]: [d_date#46]
+Join condition: None
+
+(59) Project [codegen id : 2]
+Output [1]: [d_date_sk#5]
+Input [2]: [d_date_sk#5, d_date#45]
+
+(60) BroadcastExchange
+Input [1]: [d_date_sk#5]
+Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#51]
+
+Subquery:2 Hosting operator id = 55 Hosting Expression = Subquery scalar-subquery#48, [id=#49]
+* Project (64)
++- * Filter (63)
+ +- * ColumnarToRow (62)
+ +- Scan parquet default.date_dim (61)
+
+
+(61) Scan parquet default.date_dim
+Output [2]: [d_date#52, d_week_seq#53]
Batched: true
Location [not included in comparison]/{warehouse_dir}/date_dim]
PushedFilters: [IsNotNull(d_date), EqualTo(d_date,2000-01-03)]
ReadSchema: struct<d_date:date,d_week_seq:int>
-(61) ColumnarToRow [codegen id : 1]
-Input [2]: [d_date#51, d_week_seq#52]
+(62) ColumnarToRow [codegen id : 1]
+Input [2]: [d_date#52, d_week_seq#53]
+
+(63) Filter [codegen id : 1]
+Input [2]: [d_date#52, d_week_seq#53]
+Condition : (isnotnull(d_date#52) AND (d_date#52 = 2000-01-03))
+
+(64) Project [codegen id : 1]
+Output [1]: [d_week_seq#53]
+Input [2]: [d_date#52, d_week_seq#53]
-(62) Filter [codegen id : 1]
-Input [2]: [d_date#51, d_week_seq#52]
-Condition : (isnotnull(d_date#51) AND (d_date#51 = 2000-01-03))
+Subquery:3 Hosting operator id = 17 Hosting Expression = cs_sold_date_sk#17 IN dynamicpruning#4
-(63) Project [codegen id : 1]
-Output [1]: [d_week_seq#52]
-Input [2]: [d_date#51, d_week_seq#52]
+Subquery:4 Hosting operator id = 33 Hosting Expression = ws_sold_date_sk#30 IN dynamicpruning#4
diff --git a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58.sf100/simplified.txt b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58.sf100/simplified.txt
index 57c79b9..6f6eb5d 100644
--- a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58.sf100/simplified.txt
+++ b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58.sf100/simplified.txt
@@ -18,30 +18,32 @@ TakeOrderedAndProject [item_id,ss_item_rev,ss_dev,cs_item_rev,cs_dev,ws_item_rev
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_ext_sales_price,ss_sold_date_sk]
- InputAdapter
- BroadcastExchange #2
- WholeStageCodegen (2)
- Project [d_date_sk]
- BroadcastHashJoin [d_date,d_date]
- Filter [d_date_sk]
- ColumnarToRow
- InputAdapter
- Scan parquet default.date_dim [d_date_sk,d_date]
- InputAdapter
- BroadcastExchange #3
- WholeStageCodegen (1)
- Project [d_date]
- Filter [d_week_seq]
- Subquery #1
+ SubqueryBroadcast [d_date_sk] #1
+ BroadcastExchange #2
+ WholeStageCodegen (2)
+ Project [d_date_sk]
+ BroadcastHashJoin [d_date,d_date]
+ Filter [d_date_sk]
+ ColumnarToRow
+ InputAdapter
+ Scan parquet default.date_dim [d_date_sk,d_date]
+ InputAdapter
+ BroadcastExchange #3
WholeStageCodegen (1)
- Project [d_week_seq]
- Filter [d_date]
+ Project [d_date]
+ Filter [d_week_seq]
+ Subquery #2
+ WholeStageCodegen (1)
+ Project [d_week_seq]
+ Filter [d_date]
+ ColumnarToRow
+ InputAdapter
+ Scan parquet default.date_dim [d_date,d_week_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date,d_week_seq]
- ColumnarToRow
- InputAdapter
- Scan parquet default.date_dim [d_date,d_week_seq]
+ InputAdapter
+ ReusedExchange [d_date_sk] #2
InputAdapter
BroadcastExchange #4
WholeStageCodegen (3)
@@ -66,6 +68,7 @@ TakeOrderedAndProject [item_id,ss_item_rev,ss_dev,cs_item_rev,cs_dev,ws_item_rev
ColumnarToRow
InputAdapter
Scan parquet default.catalog_sales [cs_item_sk,cs_ext_sales_price,cs_sold_date_sk]
+ ReusedSubquery [d_date_sk] #1
InputAdapter
ReusedExchange [d_date_sk] #2
InputAdapter
@@ -87,6 +90,7 @@ TakeOrderedAndProject [item_id,ss_item_rev,ss_dev,cs_item_rev,cs_dev,ws_item_rev
ColumnarToRow
InputAdapter
Scan parquet default.web_sales [ws_item_sk,ws_ext_sales_price,ws_sold_date_sk]
+ ReusedSubquery [d_date_sk] #1
InputAdapter
ReusedExchange [d_date_sk] #2
InputAdapter
diff --git a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58/explain.txt b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58/explain.txt
index a8ce33e..67f19d3 100644
--- a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58/explain.txt
+++ b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58/explain.txt
@@ -1,15 +1,15 @@
== Physical Plan ==
-TakeOrderedAndProject (59)
-+- * Project (58)
- +- * BroadcastHashJoin Inner BuildRight (57)
- :- * Project (42)
- : +- * BroadcastHashJoin Inner BuildRight (41)
- : :- * Filter (26)
- : : +- * HashAggregate (25)
- : : +- Exchange (24)
- : : +- * HashAggregate (23)
- : : +- * Project (22)
- : : +- * BroadcastHashJoin Inner BuildRight (21)
+TakeOrderedAndProject (49)
++- * Project (48)
+ +- * BroadcastHashJoin Inner BuildRight (47)
+ :- * Project (32)
+ : +- * BroadcastHashJoin Inner BuildRight (31)
+ : :- * Filter (16)
+ : : +- * HashAggregate (15)
+ : : +- Exchange (14)
+ : : +- * HashAggregate (13)
+ : : +- * Project (12)
+ : : +- * BroadcastHashJoin Inner BuildRight (11)
: : :- * Project (9)
: : : +- * BroadcastHashJoin Inner BuildRight (8)
: : : :- * Filter (3)
@@ -19,52 +19,42 @@ TakeOrderedAndProject (59)
: : : +- * Filter (6)
: : : +- * ColumnarToRow (5)
: : : +- Scan parquet default.item (4)
- : : +- BroadcastExchange (20)
- : : +- * Project (19)
- : : +- * BroadcastHashJoin LeftSemi BuildRight (18)
- : : :- * Filter (12)
- : : : +- * ColumnarToRow (11)
- : : : +- Scan parquet default.date_dim (10)
- : : +- BroadcastExchange (17)
- : : +- * Project (16)
- : : +- * Filter (15)
- : : +- * ColumnarToRow (14)
- : : +- Scan parquet default.date_dim (13)
- : +- BroadcastExchange (40)
- : +- * Filter (39)
- : +- * HashAggregate (38)
- : +- Exchange (37)
- : +- * HashAggregate (36)
- : +- * Project (35)
- : +- * BroadcastHashJoin Inner BuildRight (34)
- : :- * Project (32)
- : : +- * BroadcastHashJoin Inner BuildRight (31)
- : : :- * Filter (29)
- : : : +- * ColumnarToRow (28)
- : : : +- Scan parquet default.catalog_sales (27)
- : : +- ReusedExchange (30)
- : +- ReusedExchange (33)
- +- BroadcastExchange (56)
- +- * Filter (55)
- +- * HashAggregate (54)
- +- Exchange (53)
- +- * HashAggregate (52)
- +- * Project (51)
- +- * BroadcastHashJoin Inner BuildRight (50)
- :- * Project (48)
- : +- * BroadcastHashJoin Inner BuildRight (47)
- : :- * Filter (45)
- : : +- * ColumnarToRow (44)
- : : +- Scan parquet default.web_sales (43)
- : +- ReusedExchange (46)
- +- ReusedExchange (49)
+ : : +- ReusedExchange (10)
+ : +- BroadcastExchange (30)
+ : +- * Filter (29)
+ : +- * HashAggregate (28)
+ : +- Exchange (27)
+ : +- * HashAggregate (26)
+ : +- * Project (25)
+ : +- * BroadcastHashJoin Inner BuildRight (24)
+ : :- * Project (22)
+ : : +- * BroadcastHashJoin Inner BuildRight (21)
+ : : :- * Filter (19)
+ : : : +- * ColumnarToRow (18)
+ : : : +- Scan parquet default.catalog_sales (17)
+ : : +- ReusedExchange (20)
+ : +- ReusedExchange (23)
+ +- BroadcastExchange (46)
+ +- * Filter (45)
+ +- * HashAggregate (44)
+ +- Exchange (43)
+ +- * HashAggregate (42)
+ +- * Project (41)
+ +- * BroadcastHashJoin Inner BuildRight (40)
+ :- * Project (38)
+ : +- * BroadcastHashJoin Inner BuildRight (37)
+ : :- * Filter (35)
+ : : +- * ColumnarToRow (34)
+ : : +- Scan parquet default.web_sales (33)
+ : +- ReusedExchange (36)
+ +- ReusedExchange (39)
(1) Scan parquet default.store_sales
Output [3]: [ss_item_sk#1, ss_ext_sales_price#2, ss_sold_date_sk#3]
Batched: true
Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(ss_sold_date_sk#3), dynamicpruningexpression(true)]
+PartitionFilters: [isnotnull(ss_sold_date_sk#3), dynamicpruningexpression(ss_sold_date_sk#3 IN dynamicpruning#4)]
PushedFilters: [IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_item_sk:int,ss_ext_sales_price:decimal(7,2)>
@@ -76,289 +66,310 @@ Input [3]: [ss_item_sk#1, ss_ext_sales_price#2, ss_sold_date_sk#3]
Condition : isnotnull(ss_item_sk#1)
(4) Scan parquet default.item
-Output [2]: [i_item_sk#4, i_item_id#5]
+Output [2]: [i_item_sk#5, i_item_id#6]
Batched: true
Location [not included in comparison]/{warehouse_dir}/item]
PushedFilters: [IsNotNull(i_item_sk), IsNotNull(i_item_id)]
ReadSchema: struct<i_item_sk:int,i_item_id:string>
(5) ColumnarToRow [codegen id : 1]
-Input [2]: [i_item_sk#4, i_item_id#5]
+Input [2]: [i_item_sk#5, i_item_id#6]
(6) Filter [codegen id : 1]
-Input [2]: [i_item_sk#4, i_item_id#5]
-Condition : (isnotnull(i_item_sk#4) AND isnotnull(i_item_id#5))
+Input [2]: [i_item_sk#5, i_item_id#6]
+Condition : (isnotnull(i_item_sk#5) AND isnotnull(i_item_id#6))
(7) BroadcastExchange
-Input [2]: [i_item_sk#4, i_item_id#5]
-Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#6]
+Input [2]: [i_item_sk#5, i_item_id#6]
+Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7]
(8) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_item_sk#1]
-Right keys [1]: [i_item_sk#4]
+Right keys [1]: [i_item_sk#5]
Join condition: None
(9) Project [codegen id : 4]
-Output [3]: [ss_ext_sales_price#2, ss_sold_date_sk#3, i_item_id#5]
-Input [5]: [ss_item_sk#1, ss_ext_sales_price#2, ss_sold_date_sk#3, i_item_sk#4, i_item_id#5]
+Output [3]: [ss_ext_sales_price#2, ss_sold_date_sk#3, i_item_id#6]
+Input [5]: [ss_item_sk#1, ss_ext_sales_price#2, ss_sold_date_sk#3, i_item_sk#5, i_item_id#6]
-(10) Scan parquet default.date_dim
-Output [2]: [d_date_sk#7, d_date#8]
-Batched: true
-Location [not included in comparison]/{warehouse_dir}/date_dim]
-PushedFilters: [IsNotNull(d_date_sk)]
-ReadSchema: struct<d_date_sk:int,d_date:date>
-
-(11) ColumnarToRow [codegen id : 3]
-Input [2]: [d_date_sk#7, d_date#8]
-
-(12) Filter [codegen id : 3]
-Input [2]: [d_date_sk#7, d_date#8]
-Condition : isnotnull(d_date_sk#7)
-
-(13) Scan parquet default.date_dim
-Output [2]: [d_date#9, d_week_seq#10]
-Batched: true
-Location [not included in comparison]/{warehouse_dir}/date_dim]
-PushedFilters: [IsNotNull(d_week_seq)]
-ReadSchema: struct<d_date:date,d_week_seq:int>
+(10) ReusedExchange [Reuses operator id: 60]
+Output [1]: [d_date_sk#8]
-(14) ColumnarToRow [codegen id : 2]
-Input [2]: [d_date#9, d_week_seq#10]
-
-(15) Filter [codegen id : 2]
-Input [2]: [d_date#9, d_week_seq#10]
-Condition : (isnotnull(d_week_seq#10) AND (d_week_seq#10 = Subquery scalar-subquery#11, [id=#12]))
-
-(16) Project [codegen id : 2]
-Output [1]: [d_date#9]
-Input [2]: [d_date#9, d_week_seq#10]
-
-(17) BroadcastExchange
-Input [1]: [d_date#9]
-Arguments: HashedRelationBroadcastMode(List(input[0, date, true]),false), [id=#13]
-
-(18) BroadcastHashJoin [codegen id : 3]
-Left keys [1]: [d_date#8]
-Right keys [1]: [d_date#9]
-Join condition: None
-
-(19) Project [codegen id : 3]
-Output [1]: [d_date_sk#7]
-Input [2]: [d_date_sk#7, d_date#8]
-
-(20) BroadcastExchange
-Input [1]: [d_date_sk#7]
-Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#14]
-
-(21) BroadcastHashJoin [codegen id : 4]
+(11) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#3]
-Right keys [1]: [d_date_sk#7]
+Right keys [1]: [d_date_sk#8]
Join condition: None
-(22) Project [codegen id : 4]
-Output [2]: [ss_ext_sales_price#2, i_item_id#5]
-Input [4]: [ss_ext_sales_price#2, ss_sold_date_sk#3, i_item_id#5, d_date_sk#7]
+(12) Project [codegen id : 4]
+Output [2]: [ss_ext_sales_price#2, i_item_id#6]
+Input [4]: [ss_ext_sales_price#2, ss_sold_date_sk#3, i_item_id#6, d_date_sk#8]
-(23) HashAggregate [codegen id : 4]
-Input [2]: [ss_ext_sales_price#2, i_item_id#5]
-Keys [1]: [i_item_id#5]
+(13) HashAggregate [codegen id : 4]
+Input [2]: [ss_ext_sales_price#2, i_item_id#6]
+Keys [1]: [i_item_id#6]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#2))]
-Aggregate Attributes [1]: [sum#15]
-Results [2]: [i_item_id#5, sum#16]
+Aggregate Attributes [1]: [sum#9]
+Results [2]: [i_item_id#6, sum#10]
-(24) Exchange
-Input [2]: [i_item_id#5, sum#16]
-Arguments: hashpartitioning(i_item_id#5, 5), ENSURE_REQUIREMENTS, [id=#17]
+(14) Exchange
+Input [2]: [i_item_id#6, sum#10]
+Arguments: hashpartitioning(i_item_id#6, 5), ENSURE_REQUIREMENTS, [id=#11]
-(25) HashAggregate [codegen id : 15]
-Input [2]: [i_item_id#5, sum#16]
-Keys [1]: [i_item_id#5]
+(15) HashAggregate [codegen id : 15]
+Input [2]: [i_item_id#6, sum#10]
+Keys [1]: [i_item_id#6]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#2))]
-Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#2))#18]
-Results [2]: [i_item_id#5 AS item_id#19, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#2))#18,17,2) AS ss_item_rev#20]
+Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#2))#12]
+Results [2]: [i_item_id#6 AS item_id#13, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#2))#12,17,2) AS ss_item_rev#14]
-(26) Filter [codegen id : 15]
-Input [2]: [item_id#19, ss_item_rev#20]
-Condition : isnotnull(ss_item_rev#20)
+(16) Filter [codegen id : 15]
+Input [2]: [item_id#13, ss_item_rev#14]
+Condition : isnotnull(ss_item_rev#14)
-(27) Scan parquet default.catalog_sales
-Output [3]: [cs_item_sk#21, cs_ext_sales_price#22, cs_sold_date_sk#23]
+(17) Scan parquet default.catalog_sales
+Output [3]: [cs_item_sk#15, cs_ext_sales_price#16, cs_sold_date_sk#17]
Batched: true
Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(cs_sold_date_sk#23), dynamicpruningexpression(true)]
+PartitionFilters: [isnotnull(cs_sold_date_sk#17), dynamicpruningexpression(cs_sold_date_sk#17 IN dynamicpruning#4)]
PushedFilters: [IsNotNull(cs_item_sk)]
ReadSchema: struct<cs_item_sk:int,cs_ext_sales_price:decimal(7,2)>
-(28) ColumnarToRow [codegen id : 8]
-Input [3]: [cs_item_sk#21, cs_ext_sales_price#22, cs_sold_date_sk#23]
+(18) ColumnarToRow [codegen id : 8]
+Input [3]: [cs_item_sk#15, cs_ext_sales_price#16, cs_sold_date_sk#17]
-(29) Filter [codegen id : 8]
-Input [3]: [cs_item_sk#21, cs_ext_sales_price#22, cs_sold_date_sk#23]
-Condition : isnotnull(cs_item_sk#21)
+(19) Filter [codegen id : 8]
+Input [3]: [cs_item_sk#15, cs_ext_sales_price#16, cs_sold_date_sk#17]
+Condition : isnotnull(cs_item_sk#15)
-(30) ReusedExchange [Reuses operator id: 7]
-Output [2]: [i_item_sk#24, i_item_id#25]
+(20) ReusedExchange [Reuses operator id: 7]
+Output [2]: [i_item_sk#18, i_item_id#19]
-(31) BroadcastHashJoin [codegen id : 8]
-Left keys [1]: [cs_item_sk#21]
-Right keys [1]: [i_item_sk#24]
+(21) BroadcastHashJoin [codegen id : 8]
+Left keys [1]: [cs_item_sk#15]
+Right keys [1]: [i_item_sk#18]
Join condition: None
-(32) Project [codegen id : 8]
-Output [3]: [cs_ext_sales_price#22, cs_sold_date_sk#23, i_item_id#25]
-Input [5]: [cs_item_sk#21, cs_ext_sales_price#22, cs_sold_date_sk#23, i_item_sk#24, i_item_id#25]
+(22) Project [codegen id : 8]
+Output [3]: [cs_ext_sales_price#16, cs_sold_date_sk#17, i_item_id#19]
+Input [5]: [cs_item_sk#15, cs_ext_sales_price#16, cs_sold_date_sk#17, i_item_sk#18, i_item_id#19]
-(33) ReusedExchange [Reuses operator id: 20]
-Output [1]: [d_date_sk#26]
+(23) ReusedExchange [Reuses operator id: 60]
+Output [1]: [d_date_sk#20]
-(34) BroadcastHashJoin [codegen id : 8]
-Left keys [1]: [cs_sold_date_sk#23]
-Right keys [1]: [d_date_sk#26]
+(24) BroadcastHashJoin [codegen id : 8]
+Left keys [1]: [cs_sold_date_sk#17]
+Right keys [1]: [d_date_sk#20]
Join condition: None
-(35) Project [codegen id : 8]
-Output [2]: [cs_ext_sales_price#22, i_item_id#25]
-Input [4]: [cs_ext_sales_price#22, cs_sold_date_sk#23, i_item_id#25, d_date_sk#26]
-
-(36) HashAggregate [codegen id : 8]
-Input [2]: [cs_ext_sales_price#22, i_item_id#25]
-Keys [1]: [i_item_id#25]
-Functions [1]: [partial_sum(UnscaledValue(cs_ext_sales_price#22))]
-Aggregate Attributes [1]: [sum#27]
-Results [2]: [i_item_id#25, sum#28]
-
-(37) Exchange
-Input [2]: [i_item_id#25, sum#28]
-Arguments: hashpartitioning(i_item_id#25, 5), ENSURE_REQUIREMENTS, [id=#29]
-
-(38) HashAggregate [codegen id : 9]
-Input [2]: [i_item_id#25, sum#28]
-Keys [1]: [i_item_id#25]
-Functions [1]: [sum(UnscaledValue(cs_ext_sales_price#22))]
-Aggregate Attributes [1]: [sum(UnscaledValue(cs_ext_sales_price#22))#30]
-Results [2]: [i_item_id#25 AS item_id#31, MakeDecimal(sum(UnscaledValue(cs_ext_sales_price#22))#30,17,2) AS cs_item_rev#32]
-
-(39) Filter [codegen id : 9]
-Input [2]: [item_id#31, cs_item_rev#32]
-Condition : isnotnull(cs_item_rev#32)
-
-(40) BroadcastExchange
-Input [2]: [item_id#31, cs_item_rev#32]
-Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#33]
-
-(41) BroadcastHashJoin [codegen id : 15]
-Left keys [1]: [item_id#19]
-Right keys [1]: [item_id#31]
-Join condition: ((((cast(ss_item_rev#20 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(cs_item_rev#32)), DecimalType(19,3), true)) AND (cast(ss_item_rev#20 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision(cs_item_rev#32)), DecimalType(20,3), true))) AND (cast(cs_item_rev#32 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ss_item_rev#20)), DecimalType(19,3), true))) AND (cast(cs_item_rev#32 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision [...]
-
-(42) Project [codegen id : 15]
-Output [3]: [item_id#19, ss_item_rev#20, cs_item_rev#32]
-Input [4]: [item_id#19, ss_item_rev#20, item_id#31, cs_item_rev#32]
-
-(43) Scan parquet default.web_sales
-Output [3]: [ws_item_sk#34, ws_ext_sales_price#35, ws_sold_date_sk#36]
+(25) Project [codegen id : 8]
+Output [2]: [cs_ext_sales_price#16, i_item_id#19]
+Input [4]: [cs_ext_sales_price#16, cs_sold_date_sk#17, i_item_id#19, d_date_sk#20]
+
+(26) HashAggregate [codegen id : 8]
+Input [2]: [cs_ext_sales_price#16, i_item_id#19]
+Keys [1]: [i_item_id#19]
+Functions [1]: [partial_sum(UnscaledValue(cs_ext_sales_price#16))]
+Aggregate Attributes [1]: [sum#21]
+Results [2]: [i_item_id#19, sum#22]
+
+(27) Exchange
+Input [2]: [i_item_id#19, sum#22]
+Arguments: hashpartitioning(i_item_id#19, 5), ENSURE_REQUIREMENTS, [id=#23]
+
+(28) HashAggregate [codegen id : 9]
+Input [2]: [i_item_id#19, sum#22]
+Keys [1]: [i_item_id#19]
+Functions [1]: [sum(UnscaledValue(cs_ext_sales_price#16))]
+Aggregate Attributes [1]: [sum(UnscaledValue(cs_ext_sales_price#16))#24]
+Results [2]: [i_item_id#19 AS item_id#25, MakeDecimal(sum(UnscaledValue(cs_ext_sales_price#16))#24,17,2) AS cs_item_rev#26]
+
+(29) Filter [codegen id : 9]
+Input [2]: [item_id#25, cs_item_rev#26]
+Condition : isnotnull(cs_item_rev#26)
+
+(30) BroadcastExchange
+Input [2]: [item_id#25, cs_item_rev#26]
+Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#27]
+
+(31) BroadcastHashJoin [codegen id : 15]
+Left keys [1]: [item_id#13]
+Right keys [1]: [item_id#25]
+Join condition: ((((cast(ss_item_rev#14 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(cs_item_rev#26)), DecimalType(19,3), true)) AND (cast(ss_item_rev#14 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision(cs_item_rev#26)), DecimalType(20,3), true))) AND (cast(cs_item_rev#26 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ss_item_rev#14)), DecimalType(19,3), true))) AND (cast(cs_item_rev#26 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision [...]
+
+(32) Project [codegen id : 15]
+Output [3]: [item_id#13, ss_item_rev#14, cs_item_rev#26]
+Input [4]: [item_id#13, ss_item_rev#14, item_id#25, cs_item_rev#26]
+
+(33) Scan parquet default.web_sales
+Output [3]: [ws_item_sk#28, ws_ext_sales_price#29, ws_sold_date_sk#30]
Batched: true
Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(ws_sold_date_sk#36), dynamicpruningexpression(true)]
+PartitionFilters: [isnotnull(ws_sold_date_sk#30), dynamicpruningexpression(ws_sold_date_sk#30 IN dynamicpruning#4)]
PushedFilters: [IsNotNull(ws_item_sk)]
ReadSchema: struct<ws_item_sk:int,ws_ext_sales_price:decimal(7,2)>
-(44) ColumnarToRow [codegen id : 13]
-Input [3]: [ws_item_sk#34, ws_ext_sales_price#35, ws_sold_date_sk#36]
+(34) ColumnarToRow [codegen id : 13]
+Input [3]: [ws_item_sk#28, ws_ext_sales_price#29, ws_sold_date_sk#30]
-(45) Filter [codegen id : 13]
-Input [3]: [ws_item_sk#34, ws_ext_sales_price#35, ws_sold_date_sk#36]
-Condition : isnotnull(ws_item_sk#34)
+(35) Filter [codegen id : 13]
+Input [3]: [ws_item_sk#28, ws_ext_sales_price#29, ws_sold_date_sk#30]
+Condition : isnotnull(ws_item_sk#28)
-(46) ReusedExchange [Reuses operator id: 7]
-Output [2]: [i_item_sk#37, i_item_id#38]
+(36) ReusedExchange [Reuses operator id: 7]
+Output [2]: [i_item_sk#31, i_item_id#32]
-(47) BroadcastHashJoin [codegen id : 13]
-Left keys [1]: [ws_item_sk#34]
-Right keys [1]: [i_item_sk#37]
+(37) BroadcastHashJoin [codegen id : 13]
+Left keys [1]: [ws_item_sk#28]
+Right keys [1]: [i_item_sk#31]
Join condition: None
-(48) Project [codegen id : 13]
-Output [3]: [ws_ext_sales_price#35, ws_sold_date_sk#36, i_item_id#38]
-Input [5]: [ws_item_sk#34, ws_ext_sales_price#35, ws_sold_date_sk#36, i_item_sk#37, i_item_id#38]
+(38) Project [codegen id : 13]
+Output [3]: [ws_ext_sales_price#29, ws_sold_date_sk#30, i_item_id#32]
+Input [5]: [ws_item_sk#28, ws_ext_sales_price#29, ws_sold_date_sk#30, i_item_sk#31, i_item_id#32]
-(49) ReusedExchange [Reuses operator id: 20]
-Output [1]: [d_date_sk#39]
+(39) ReusedExchange [Reuses operator id: 60]
+Output [1]: [d_date_sk#33]
-(50) BroadcastHashJoin [codegen id : 13]
-Left keys [1]: [ws_sold_date_sk#36]
-Right keys [1]: [d_date_sk#39]
+(40) BroadcastHashJoin [codegen id : 13]
+Left keys [1]: [ws_sold_date_sk#30]
+Right keys [1]: [d_date_sk#33]
Join condition: None
-(51) Project [codegen id : 13]
-Output [2]: [ws_ext_sales_price#35, i_item_id#38]
-Input [4]: [ws_ext_sales_price#35, ws_sold_date_sk#36, i_item_id#38, d_date_sk#39]
-
-(52) HashAggregate [codegen id : 13]
-Input [2]: [ws_ext_sales_price#35, i_item_id#38]
-Keys [1]: [i_item_id#38]
-Functions [1]: [partial_sum(UnscaledValue(ws_ext_sales_price#35))]
-Aggregate Attributes [1]: [sum#40]
-Results [2]: [i_item_id#38, sum#41]
-
-(53) Exchange
-Input [2]: [i_item_id#38, sum#41]
-Arguments: hashpartitioning(i_item_id#38, 5), ENSURE_REQUIREMENTS, [id=#42]
-
-(54) HashAggregate [codegen id : 14]
-Input [2]: [i_item_id#38, sum#41]
-Keys [1]: [i_item_id#38]
-Functions [1]: [sum(UnscaledValue(ws_ext_sales_price#35))]
-Aggregate Attributes [1]: [sum(UnscaledValue(ws_ext_sales_price#35))#43]
-Results [2]: [i_item_id#38 AS item_id#44, MakeDecimal(sum(UnscaledValue(ws_ext_sales_price#35))#43,17,2) AS ws_item_rev#45]
-
-(55) Filter [codegen id : 14]
-Input [2]: [item_id#44, ws_item_rev#45]
-Condition : isnotnull(ws_item_rev#45)
-
-(56) BroadcastExchange
-Input [2]: [item_id#44, ws_item_rev#45]
-Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#46]
-
-(57) BroadcastHashJoin [codegen id : 15]
-Left keys [1]: [item_id#19]
-Right keys [1]: [item_id#44]
-Join condition: ((((((((cast(ss_item_rev#20 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ws_item_rev#45)), DecimalType(19,3), true)) AND (cast(ss_item_rev#20 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision(ws_item_rev#45)), DecimalType(20,3), true))) AND (cast(cs_item_rev#32 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ws_item_rev#45)), DecimalType(19,3), true))) AND (cast(cs_item_rev#32 as decimal(20,3)) <= CheckOverflow((1.10 * promote_preci [...]
-
-(58) Project [codegen id : 15]
-Output [8]: [item_id#19, ss_item_rev#20, CheckOverflow((promote_precision(CheckOverflow((promote_precision(CheckOverflow((promote_precision(cast(ss_item_rev#20 as decimal(19,2))) / promote_precision(CheckOverflow((promote_precision(cast(CheckOverflow((promote_precision(cast(ss_item_rev#20 as decimal(18,2))) + promote_precision(cast(cs_item_rev#32 as decimal(18,2)))), DecimalType(18,2), true) as decimal(19,2))) + promote_precision(cast(ws_item_rev#45 as decimal(19,2)))), DecimalType(19,2) [...]
-Input [5]: [item_id#19, ss_item_rev#20, cs_item_rev#32, item_id#44, ws_item_rev#45]
-
-(59) TakeOrderedAndProject
-Input [8]: [item_id#19, ss_item_rev#20, ss_dev#47, cs_item_rev#32, cs_dev#48, ws_item_rev#45, ws_dev#49, average#50]
-Arguments: 100, [item_id#19 ASC NULLS FIRST, ss_item_rev#20 ASC NULLS FIRST], [item_id#19, ss_item_rev#20, ss_dev#47, cs_item_rev#32, cs_dev#48, ws_item_rev#45, ws_dev#49, average#50]
+(41) Project [codegen id : 13]
+Output [2]: [ws_ext_sales_price#29, i_item_id#32]
+Input [4]: [ws_ext_sales_price#29, ws_sold_date_sk#30, i_item_id#32, d_date_sk#33]
+
+(42) HashAggregate [codegen id : 13]
+Input [2]: [ws_ext_sales_price#29, i_item_id#32]
+Keys [1]: [i_item_id#32]
+Functions [1]: [partial_sum(UnscaledValue(ws_ext_sales_price#29))]
+Aggregate Attributes [1]: [sum#34]
+Results [2]: [i_item_id#32, sum#35]
+
+(43) Exchange
+Input [2]: [i_item_id#32, sum#35]
+Arguments: hashpartitioning(i_item_id#32, 5), ENSURE_REQUIREMENTS, [id=#36]
+
+(44) HashAggregate [codegen id : 14]
+Input [2]: [i_item_id#32, sum#35]
+Keys [1]: [i_item_id#32]
+Functions [1]: [sum(UnscaledValue(ws_ext_sales_price#29))]
+Aggregate Attributes [1]: [sum(UnscaledValue(ws_ext_sales_price#29))#37]
+Results [2]: [i_item_id#32 AS item_id#38, MakeDecimal(sum(UnscaledValue(ws_ext_sales_price#29))#37,17,2) AS ws_item_rev#39]
+
+(45) Filter [codegen id : 14]
+Input [2]: [item_id#38, ws_item_rev#39]
+Condition : isnotnull(ws_item_rev#39)
+
+(46) BroadcastExchange
+Input [2]: [item_id#38, ws_item_rev#39]
+Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#40]
+
+(47) BroadcastHashJoin [codegen id : 15]
+Left keys [1]: [item_id#13]
+Right keys [1]: [item_id#38]
+Join condition: ((((((((cast(ss_item_rev#14 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ws_item_rev#39)), DecimalType(19,3), true)) AND (cast(ss_item_rev#14 as decimal(20,3)) <= CheckOverflow((1.10 * promote_precision(ws_item_rev#39)), DecimalType(20,3), true))) AND (cast(cs_item_rev#26 as decimal(19,3)) >= CheckOverflow((0.90 * promote_precision(ws_item_rev#39)), DecimalType(19,3), true))) AND (cast(cs_item_rev#26 as decimal(20,3)) <= CheckOverflow((1.10 * promote_preci [...]
+
+(48) Project [codegen id : 15]
+Output [8]: [item_id#13, ss_item_rev#14, CheckOverflow((promote_precision(CheckOverflow((promote_precision(CheckOverflow((promote_precision(cast(ss_item_rev#14 as decimal(19,2))) / promote_precision(CheckOverflow((promote_precision(cast(CheckOverflow((promote_precision(cast(ss_item_rev#14 as decimal(18,2))) + promote_precision(cast(cs_item_rev#26 as decimal(18,2)))), DecimalType(18,2), true) as decimal(19,2))) + promote_precision(cast(ws_item_rev#39 as decimal(19,2)))), DecimalType(19,2) [...]
+Input [5]: [item_id#13, ss_item_rev#14, cs_item_rev#26, item_id#38, ws_item_rev#39]
+
+(49) TakeOrderedAndProject
+Input [8]: [item_id#13, ss_item_rev#14, ss_dev#41, cs_item_rev#26, cs_dev#42, ws_item_rev#39, ws_dev#43, average#44]
+Arguments: 100, [item_id#13 ASC NULLS FIRST, ss_item_rev#14 ASC NULLS FIRST], [item_id#13, ss_item_rev#14, ss_dev#41, cs_item_rev#26, cs_dev#42, ws_item_rev#39, ws_dev#43, average#44]
===== Subqueries =====
-Subquery:1 Hosting operator id = 15 Hosting Expression = Subquery scalar-subquery#11, [id=#12]
-* Project (63)
-+- * Filter (62)
- +- * ColumnarToRow (61)
- +- Scan parquet default.date_dim (60)
+Subquery:1 Hosting operator id = 1 Hosting Expression = ss_sold_date_sk#3 IN dynamicpruning#4
+BroadcastExchange (60)
++- * Project (59)
+ +- * BroadcastHashJoin LeftSemi BuildRight (58)
+ :- * Filter (52)
+ : +- * ColumnarToRow (51)
+ : +- Scan parquet default.date_dim (50)
+ +- BroadcastExchange (57)
+ +- * Project (56)
+ +- * Filter (55)
+ +- * ColumnarToRow (54)
+ +- Scan parquet default.date_dim (53)
+
+
+(50) Scan parquet default.date_dim
+Output [2]: [d_date_sk#8, d_date#45]
+Batched: true
+Location [not included in comparison]/{warehouse_dir}/date_dim]
+PushedFilters: [IsNotNull(d_date_sk)]
+ReadSchema: struct<d_date_sk:int,d_date:date>
+
+(51) ColumnarToRow [codegen id : 2]
+Input [2]: [d_date_sk#8, d_date#45]
+
+(52) Filter [codegen id : 2]
+Input [2]: [d_date_sk#8, d_date#45]
+Condition : isnotnull(d_date_sk#8)
+
+(53) Scan parquet default.date_dim
+Output [2]: [d_date#46, d_week_seq#47]
+Batched: true
+Location [not included in comparison]/{warehouse_dir}/date_dim]
+PushedFilters: [IsNotNull(d_week_seq)]
+ReadSchema: struct<d_date:date,d_week_seq:int>
+
+(54) ColumnarToRow [codegen id : 1]
+Input [2]: [d_date#46, d_week_seq#47]
+(55) Filter [codegen id : 1]
+Input [2]: [d_date#46, d_week_seq#47]
+Condition : (isnotnull(d_week_seq#47) AND (d_week_seq#47 = Subquery scalar-subquery#48, [id=#49]))
-(60) Scan parquet default.date_dim
-Output [2]: [d_date#51, d_week_seq#52]
+(56) Project [codegen id : 1]
+Output [1]: [d_date#46]
+Input [2]: [d_date#46, d_week_seq#47]
+
+(57) BroadcastExchange
+Input [1]: [d_date#46]
+Arguments: HashedRelationBroadcastMode(List(input[0, date, true]),false), [id=#50]
+
+(58) BroadcastHashJoin [codegen id : 2]
+Left keys [1]: [d_date#45]
+Right keys [1]: [d_date#46]
+Join condition: None
+
+(59) Project [codegen id : 2]
+Output [1]: [d_date_sk#8]
+Input [2]: [d_date_sk#8, d_date#45]
+
+(60) BroadcastExchange
+Input [1]: [d_date_sk#8]
+Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#51]
+
+Subquery:2 Hosting operator id = 55 Hosting Expression = Subquery scalar-subquery#48, [id=#49]
+* Project (64)
++- * Filter (63)
+ +- * ColumnarToRow (62)
+ +- Scan parquet default.date_dim (61)
+
+
+(61) Scan parquet default.date_dim
+Output [2]: [d_date#52, d_week_seq#53]
Batched: true
Location [not included in comparison]/{warehouse_dir}/date_dim]
PushedFilters: [IsNotNull(d_date), EqualTo(d_date,2000-01-03)]
ReadSchema: struct<d_date:date,d_week_seq:int>
-(61) ColumnarToRow [codegen id : 1]
-Input [2]: [d_date#51, d_week_seq#52]
+(62) ColumnarToRow [codegen id : 1]
+Input [2]: [d_date#52, d_week_seq#53]
+
+(63) Filter [codegen id : 1]
+Input [2]: [d_date#52, d_week_seq#53]
+Condition : (isnotnull(d_date#52) AND (d_date#52 = 2000-01-03))
+
+(64) Project [codegen id : 1]
+Output [1]: [d_week_seq#53]
+Input [2]: [d_date#52, d_week_seq#53]
-(62) Filter [codegen id : 1]
-Input [2]: [d_date#51, d_week_seq#52]
-Condition : (isnotnull(d_date#51) AND (d_date#51 = 2000-01-03))
+Subquery:3 Hosting operator id = 17 Hosting Expression = cs_sold_date_sk#17 IN dynamicpruning#4
-(63) Project [codegen id : 1]
-Output [1]: [d_week_seq#52]
-Input [2]: [d_date#51, d_week_seq#52]
+Subquery:4 Hosting operator id = 33 Hosting Expression = ws_sold_date_sk#30 IN dynamicpruning#4
diff --git a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58/simplified.txt b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58/simplified.txt
index 1fb6832..cb69889 100644
--- a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58/simplified.txt
+++ b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q58/simplified.txt
@@ -18,37 +18,39 @@ TakeOrderedAndProject [item_id,ss_item_rev,ss_dev,cs_item_rev,cs_dev,ws_item_rev
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_ext_sales_price,ss_sold_date_sk]
+ SubqueryBroadcast [d_date_sk] #1
+ BroadcastExchange #2
+ WholeStageCodegen (2)
+ Project [d_date_sk]
+ BroadcastHashJoin [d_date,d_date]
+ Filter [d_date_sk]
+ ColumnarToRow
+ InputAdapter
+ Scan parquet default.date_dim [d_date_sk,d_date]
+ InputAdapter
+ BroadcastExchange #3
+ WholeStageCodegen (1)
+ Project [d_date]
+ Filter [d_week_seq]
+ Subquery #2
+ WholeStageCodegen (1)
+ Project [d_week_seq]
+ Filter [d_date]
+ ColumnarToRow
+ InputAdapter
+ Scan parquet default.date_dim [d_date,d_week_seq]
+ ColumnarToRow
+ InputAdapter
+ Scan parquet default.date_dim [d_date,d_week_seq]
InputAdapter
- BroadcastExchange #2
+ BroadcastExchange #4
WholeStageCodegen (1)
Filter [i_item_sk,i_item_id]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_item_sk,i_item_id]
InputAdapter
- BroadcastExchange #3
- WholeStageCodegen (3)
- Project [d_date_sk]
- BroadcastHashJoin [d_date,d_date]
- Filter [d_date_sk]
- ColumnarToRow
- InputAdapter
- Scan parquet default.date_dim [d_date_sk,d_date]
- InputAdapter
- BroadcastExchange #4
- WholeStageCodegen (2)
- Project [d_date]
- Filter [d_week_seq]
- Subquery #1
- WholeStageCodegen (1)
- Project [d_week_seq]
- Filter [d_date]
- ColumnarToRow
- InputAdapter
- Scan parquet default.date_dim [d_date,d_week_seq]
- ColumnarToRow
- InputAdapter
- Scan parquet default.date_dim [d_date,d_week_seq]
+ ReusedExchange [d_date_sk] #2
InputAdapter
BroadcastExchange #5
WholeStageCodegen (9)
@@ -66,10 +68,11 @@ TakeOrderedAndProject [item_id,ss_item_rev,ss_dev,cs_item_rev,cs_dev,ws_item_rev
ColumnarToRow
InputAdapter
Scan parquet default.catalog_sales [cs_item_sk,cs_ext_sales_price,cs_sold_date_sk]
+ ReusedSubquery [d_date_sk] #1
InputAdapter
- ReusedExchange [i_item_sk,i_item_id] #2
+ ReusedExchange [i_item_sk,i_item_id] #4
InputAdapter
- ReusedExchange [d_date_sk] #3
+ ReusedExchange [d_date_sk] #2
InputAdapter
BroadcastExchange #7
WholeStageCodegen (14)
@@ -87,7 +90,8 @@ TakeOrderedAndProject [item_id,ss_item_rev,ss_dev,cs_item_rev,cs_dev,ws_item_rev
ColumnarToRow
InputAdapter
Scan parquet default.web_sales [ws_item_sk,ws_ext_sales_price,ws_sold_date_sk]
+ ReusedSubquery [d_date_sk] #1
InputAdapter
- ReusedExchange [i_item_sk,i_item_id] #2
+ ReusedExchange [i_item_sk,i_item_id] #4
InputAdapter
- ReusedExchange [d_date_sk] #3
+ ReusedExchange [d_date_sk] #2
diff --git a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83.sf100/explain.txt b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83.sf100/explain.txt
index 7f0868b..175a1c6 100644
--- a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83.sf100/explain.txt
+++ b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83.sf100/explain.txt
@@ -1,72 +1,57 @@
== Physical Plan ==
-TakeOrderedAndProject (61)
-+- * Project (60)
- +- * BroadcastHashJoin Inner BuildRight (59)
- :- * Project (45)
- : +- * BroadcastHashJoin Inner BuildRight (44)
- : :- * HashAggregate (30)
- : : +- Exchange (29)
- : : +- * HashAggregate (28)
- : : +- * Project (27)
- : : +- * BroadcastHashJoin Inner BuildRight (26)
- : : :- * Project (21)
- : : : +- * BroadcastHashJoin Inner BuildRight (20)
+TakeOrderedAndProject (46)
++- * Project (45)
+ +- * BroadcastHashJoin Inner BuildRight (44)
+ :- * Project (30)
+ : +- * BroadcastHashJoin Inner BuildRight (29)
+ : :- * HashAggregate (15)
+ : : +- Exchange (14)
+ : : +- * HashAggregate (13)
+ : : +- * Project (12)
+ : : +- * BroadcastHashJoin Inner BuildRight (11)
+ : : :- * Project (6)
+ : : : +- * BroadcastHashJoin Inner BuildRight (5)
: : : :- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.store_returns (1)
- : : : +- BroadcastExchange (19)
- : : : +- * Project (18)
- : : : +- * BroadcastHashJoin LeftSemi BuildRight (17)
- : : : :- * Filter (6)
- : : : : +- * ColumnarToRow (5)
- : : : : +- Scan parquet default.date_dim (4)
- : : : +- BroadcastExchange (16)
- : : : +- * Project (15)
- : : : +- * BroadcastHashJoin LeftSemi BuildRight (14)
- : : : :- * ColumnarToRow (8)
- : : : : +- Scan parquet default.date_dim (7)
- : : : +- BroadcastExchange (13)
- : : : +- * Project (12)
- : : : +- * Filter (11)
- : : : +- * ColumnarToRow (10)
- : : : +- Scan parquet default.date_dim (9)
- : : +- BroadcastExchange (25)
- : : +- * Filter (24)
- : : +- * ColumnarToRow (23)
- : : +- Scan parquet default.item (22)
- : +- BroadcastExchange (43)
- : +- * HashAggregate (42)
- : +- Exchange (41)
- : +- * HashAggregate (40)
- : +- * Project (39)
- : +- * BroadcastHashJoin Inner BuildRight (38)
- : :- * Project (36)
- : : +- * BroadcastHashJoin Inner BuildRight (35)
- : : :- * Filter (33)
- : : : +- * ColumnarToRow (32)
- : : : +- Scan parquet default.catalog_returns (31)
- : : +- ReusedExchange (34)
- : +- ReusedExchange (37)
- +- BroadcastExchange (58)
- +- * HashAggregate (57)
- +- Exchange (56)
- +- * HashAggregate (55)
- +- * Project (54)
- +- * BroadcastHashJoin Inner BuildRight (53)
- :- * Project (51)
- : +- * BroadcastHashJoin Inner BuildRight (50)
- : :- * Filter (48)
- : : +- * ColumnarToRow (47)
- : : +- Scan parquet default.web_returns (46)
- : +- ReusedExchange (49)
- +- ReusedExchange (52)
+ : : : +- ReusedExchange (4)
+ : : +- BroadcastExchange (10)
+ : : +- * Filter (9)
+ : : +- * ColumnarToRow (8)
+ : : +- Scan parquet default.item (7)
+ : +- BroadcastExchange (28)
+ : +- * HashAggregate (27)
+ : +- Exchange (26)
+ : +- * HashAggregate (25)
+ : +- * Project (24)
+ : +- * BroadcastHashJoin Inner BuildRight (23)
+ : :- * Project (21)
+ : : +- * BroadcastHashJoin Inner BuildRight (20)
+ : : :- * Filter (18)
+ : : : +- * ColumnarToRow (17)
+ : : : +- Scan parquet default.catalog_returns (16)
+ : : +- ReusedExchange (19)
+ : +- ReusedExchange (22)
+ +- BroadcastExchange (43)
+ +- * HashAggregate (42)
+ +- Exchange (41)
+ +- * HashAggregate (40)
+ +- * Project (39)
+ +- * BroadcastHashJoin Inner BuildRight (38)
+ :- * Project (36)
+ : +- * BroadcastHashJoin Inner BuildRight (35)
+ : :- * Filter (33)
+ : : +- * ColumnarToRow (32)
+ : : +- Scan parquet default.web_returns (31)
+ : +- ReusedExchange (34)
+ +- ReusedExchange (37)
(1) Scan parquet default.store_returns
Output [3]: [sr_item_sk#1, sr_return_quantity#2, sr_returned_date_sk#3]
Batched: true
Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(sr_returned_date_sk#3), dynamicpruningexpression(true)]
+PartitionFilters: [isnotnull(sr_returned_date_sk#3), dynamicpruningexpression(sr_returned_date_sk#3 IN dynamicpruning#4)]
PushedFilters: [IsNotNull(sr_item_sk)]
ReadSchema: struct<sr_item_sk:int,sr_return_quantity:int>
@@ -77,271 +62,300 @@ Input [3]: [sr_item_sk#1, sr_return_quantity#2, sr_returned_date_sk#3]
Input [3]: [sr_item_sk#1, sr_return_quantity#2, sr_returned_date_sk#3]
Condition : isnotnull(sr_item_sk#1)
-(4) Scan parquet default.date_dim
-Output [2]: [d_date_sk#4, d_date#5]
-Batched: true
-Location [not included in comparison]/{warehouse_dir}/date_dim]
-PushedFilters: [IsNotNull(d_date_sk)]
-ReadSchema: struct<d_date_sk:int,d_date:date>
-
-(5) ColumnarToRow [codegen id : 3]
-Input [2]: [d_date_sk#4, d_date#5]
-
-(6) Filter [codegen id : 3]
-Input [2]: [d_date_sk#4, d_date#5]
-Condition : isnotnull(d_date_sk#4)
-
-(7) Scan parquet default.date_dim
-Output [2]: [d_date#6, d_week_seq#7]
-Batched: true
-Location [not included in comparison]/{warehouse_dir}/date_dim]
-ReadSchema: struct<d_date:date,d_week_seq:int>
-
-(8) ColumnarToRow [codegen id : 2]
-Input [2]: [d_date#6, d_week_seq#7]
-
-(9) Scan parquet default.date_dim
-Output [2]: [d_date#8, d_week_seq#9]
-Batched: true
-Location [not included in comparison]/{warehouse_dir}/date_dim]
-ReadSchema: struct<d_date:date,d_week_seq:int>
-
-(10) ColumnarToRow [codegen id : 1]
-Input [2]: [d_date#8, d_week_seq#9]
+(4) ReusedExchange [Reuses operator id: 62]
+Output [1]: [d_date_sk#5]
-(11) Filter [codegen id : 1]
-Input [2]: [d_date#8, d_week_seq#9]
-Condition : cast(d_date#8 as string) IN (2000-06-30,2000-09-27,2000-11-17)
-
-(12) Project [codegen id : 1]
-Output [1]: [d_week_seq#9]
-Input [2]: [d_date#8, d_week_seq#9]
-
-(13) BroadcastExchange
-Input [1]: [d_week_seq#9]
-Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#10]
-
-(14) BroadcastHashJoin [codegen id : 2]
-Left keys [1]: [d_week_seq#7]
-Right keys [1]: [d_week_seq#9]
-Join condition: None
-
-(15) Project [codegen id : 2]
-Output [1]: [d_date#6]
-Input [2]: [d_date#6, d_week_seq#7]
-
-(16) BroadcastExchange
-Input [1]: [d_date#6]
-Arguments: HashedRelationBroadcastMode(List(input[0, date, true]),false), [id=#11]
-
-(17) BroadcastHashJoin [codegen id : 3]
-Left keys [1]: [d_date#5]
-Right keys [1]: [d_date#6]
-Join condition: None
-
-(18) Project [codegen id : 3]
-Output [1]: [d_date_sk#4]
-Input [2]: [d_date_sk#4, d_date#5]
-
-(19) BroadcastExchange
-Input [1]: [d_date_sk#4]
-Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
-
-(20) BroadcastHashJoin [codegen id : 5]
+(5) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [sr_returned_date_sk#3]
-Right keys [1]: [d_date_sk#4]
+Right keys [1]: [d_date_sk#5]
Join condition: None
-(21) Project [codegen id : 5]
+(6) Project [codegen id : 5]
Output [2]: [sr_item_sk#1, sr_return_quantity#2]
-Input [4]: [sr_item_sk#1, sr_return_quantity#2, sr_returned_date_sk#3, d_date_sk#4]
+Input [4]: [sr_item_sk#1, sr_return_quantity#2, sr_returned_date_sk#3, d_date_sk#5]
-(22) Scan parquet default.item
-Output [2]: [i_item_sk#13, i_item_id#14]
+(7) Scan parquet default.item
+Output [2]: [i_item_sk#6, i_item_id#7]
Batched: true
Location [not included in comparison]/{warehouse_dir}/item]
PushedFilters: [IsNotNull(i_item_sk), IsNotNull(i_item_id)]
ReadSchema: struct<i_item_sk:int,i_item_id:string>
-(23) ColumnarToRow [codegen id : 4]
-Input [2]: [i_item_sk#13, i_item_id#14]
+(8) ColumnarToRow [codegen id : 4]
+Input [2]: [i_item_sk#6, i_item_id#7]
-(24) Filter [codegen id : 4]
-Input [2]: [i_item_sk#13, i_item_id#14]
-Condition : (isnotnull(i_item_sk#13) AND isnotnull(i_item_id#14))
+(9) Filter [codegen id : 4]
+Input [2]: [i_item_sk#6, i_item_id#7]
+Condition : (isnotnull(i_item_sk#6) AND isnotnull(i_item_id#7))
-(25) BroadcastExchange
-Input [2]: [i_item_sk#13, i_item_id#14]
-Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#15]
+(10) BroadcastExchange
+Input [2]: [i_item_sk#6, i_item_id#7]
+Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#8]
-(26) BroadcastHashJoin [codegen id : 5]
+(11) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [sr_item_sk#1]
-Right keys [1]: [i_item_sk#13]
+Right keys [1]: [i_item_sk#6]
Join condition: None
-(27) Project [codegen id : 5]
-Output [2]: [sr_return_quantity#2, i_item_id#14]
-Input [4]: [sr_item_sk#1, sr_return_quantity#2, i_item_sk#13, i_item_id#14]
+(12) Project [codegen id : 5]
+Output [2]: [sr_return_quantity#2, i_item_id#7]
+Input [4]: [sr_item_sk#1, sr_return_quantity#2, i_item_sk#6, i_item_id#7]
-(28) HashAggregate [codegen id : 5]
-Input [2]: [sr_return_quantity#2, i_item_id#14]
-Keys [1]: [i_item_id#14]
+(13) HashAggregate [codegen id : 5]
+Input [2]: [sr_return_quantity#2, i_item_id#7]
+Keys [1]: [i_item_id#7]
Functions [1]: [partial_sum(sr_return_quantity#2)]
-Aggregate Attributes [1]: [sum#16]
-Results [2]: [i_item_id#14, sum#17]
+Aggregate Attributes [1]: [sum#9]
+Results [2]: [i_item_id#7, sum#10]
-(29) Exchange
-Input [2]: [i_item_id#14, sum#17]
-Arguments: hashpartitioning(i_item_id#14, 5), ENSURE_REQUIREMENTS, [id=#18]
+(14) Exchange
+Input [2]: [i_item_id#7, sum#10]
+Arguments: hashpartitioning(i_item_id#7, 5), ENSURE_REQUIREMENTS, [id=#11]
-(30) HashAggregate [codegen id : 18]
-Input [2]: [i_item_id#14, sum#17]
-Keys [1]: [i_item_id#14]
+(15) HashAggregate [codegen id : 18]
+Input [2]: [i_item_id#7, sum#10]
+Keys [1]: [i_item_id#7]
Functions [1]: [sum(sr_return_quantity#2)]
-Aggregate Attributes [1]: [sum(sr_return_quantity#2)#19]
-Results [2]: [i_item_id#14 AS item_id#20, sum(sr_return_quantity#2)#19 AS sr_item_qty#21]
+Aggregate Attributes [1]: [sum(sr_return_quantity#2)#12]
+Results [2]: [i_item_id#7 AS item_id#13, sum(sr_return_quantity#2)#12 AS sr_item_qty#14]
-(31) Scan parquet default.catalog_returns
-Output [3]: [cr_item_sk#22, cr_return_quantity#23, cr_returned_date_sk#24]
+(16) Scan parquet default.catalog_returns
+Output [3]: [cr_item_sk#15, cr_return_quantity#16, cr_returned_date_sk#17]
Batched: true
Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(cr_returned_date_sk#24), dynamicpruningexpression(true)]
+PartitionFilters: [isnotnull(cr_returned_date_sk#17), dynamicpruningexpression(cr_returned_date_sk#17 IN dynamicpruning#4)]
PushedFilters: [IsNotNull(cr_item_sk)]
ReadSchema: struct<cr_item_sk:int,cr_return_quantity:int>
-(32) ColumnarToRow [codegen id : 10]
-Input [3]: [cr_item_sk#22, cr_return_quantity#23, cr_returned_date_sk#24]
+(17) ColumnarToRow [codegen id : 10]
+Input [3]: [cr_item_sk#15, cr_return_quantity#16, cr_returned_date_sk#17]
+
+(18) Filter [codegen id : 10]
+Input [3]: [cr_item_sk#15, cr_return_quantity#16, cr_returned_date_sk#17]
+Condition : isnotnull(cr_item_sk#15)
+
+(19) ReusedExchange [Reuses operator id: 62]
+Output [1]: [d_date_sk#18]
+
+(20) BroadcastHashJoin [codegen id : 10]
+Left keys [1]: [cr_returned_date_sk#17]
+Right keys [1]: [d_date_sk#18]
+Join condition: None
+
+(21) Project [codegen id : 10]
+Output [2]: [cr_item_sk#15, cr_return_quantity#16]
+Input [4]: [cr_item_sk#15, cr_return_quantity#16, cr_returned_date_sk#17, d_date_sk#18]
-(33) Filter [codegen id : 10]
-Input [3]: [cr_item_sk#22, cr_return_quantity#23, cr_returned_date_sk#24]
-Condition : isnotnull(cr_item_sk#22)
+(22) ReusedExchange [Reuses operator id: 10]
+Output [2]: [i_item_sk#19, i_item_id#20]
-(34) ReusedExchange [Reuses operator id: 19]
-Output [1]: [d_date_sk#25]
+(23) BroadcastHashJoin [codegen id : 10]
+Left keys [1]: [cr_item_sk#15]
+Right keys [1]: [i_item_sk#19]
+Join condition: None
+
+(24) Project [codegen id : 10]
+Output [2]: [cr_return_quantity#16, i_item_id#20]
+Input [4]: [cr_item_sk#15, cr_return_quantity#16, i_item_sk#19, i_item_id#20]
+
+(25) HashAggregate [codegen id : 10]
+Input [2]: [cr_return_quantity#16, i_item_id#20]
+Keys [1]: [i_item_id#20]
+Functions [1]: [partial_sum(cr_return_quantity#16)]
+Aggregate Attributes [1]: [sum#21]
+Results [2]: [i_item_id#20, sum#22]
+
+(26) Exchange
+Input [2]: [i_item_id#20, sum#22]
+Arguments: hashpartitioning(i_item_id#20, 5), ENSURE_REQUIREMENTS, [id=#23]
+
+(27) HashAggregate [codegen id : 11]
+Input [2]: [i_item_id#20, sum#22]
+Keys [1]: [i_item_id#20]
+Functions [1]: [sum(cr_return_quantity#16)]
+Aggregate Attributes [1]: [sum(cr_return_quantity#16)#24]
+Results [2]: [i_item_id#20 AS item_id#25, sum(cr_return_quantity#16)#24 AS cr_item_qty#26]
+
+(28) BroadcastExchange
+Input [2]: [item_id#25, cr_item_qty#26]
+Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#27]
+
+(29) BroadcastHashJoin [codegen id : 18]
+Left keys [1]: [item_id#13]
+Right keys [1]: [item_id#25]
+Join condition: None
+
+(30) Project [codegen id : 18]
+Output [3]: [item_id#13, sr_item_qty#14, cr_item_qty#26]
+Input [4]: [item_id#13, sr_item_qty#14, item_id#25, cr_item_qty#26]
+
+(31) Scan parquet default.web_returns
+Output [3]: [wr_item_sk#28, wr_return_quantity#29, wr_returned_date_sk#30]
+Batched: true
+Location: InMemoryFileIndex []
+PartitionFilters: [isnotnull(wr_returned_date_sk#30), dynamicpruningexpression(wr_returned_date_sk#30 IN dynamicpruning#4)]
+PushedFilters: [IsNotNull(wr_item_sk)]
+ReadSchema: struct<wr_item_sk:int,wr_return_quantity:int>
+
+(32) ColumnarToRow [codegen id : 16]
+Input [3]: [wr_item_sk#28, wr_return_quantity#29, wr_returned_date_sk#30]
+
+(33) Filter [codegen id : 16]
+Input [3]: [wr_item_sk#28, wr_return_quantity#29, wr_returned_date_sk#30]
+Condition : isnotnull(wr_item_sk#28)
+
+(34) ReusedExchange [Reuses operator id: 62]
+Output [1]: [d_date_sk#31]
-(35) BroadcastHashJoin [codegen id : 10]
-Left keys [1]: [cr_returned_date_sk#24]
-Right keys [1]: [d_date_sk#25]
+(35) BroadcastHashJoin [codegen id : 16]
+Left keys [1]: [wr_returned_date_sk#30]
+Right keys [1]: [d_date_sk#31]
Join condition: None
-(36) Project [codegen id : 10]
-Output [2]: [cr_item_sk#22, cr_return_quantity#23]
-Input [4]: [cr_item_sk#22, cr_return_quantity#23, cr_returned_date_sk#24, d_date_sk#25]
+(36) Project [codegen id : 16]
+Output [2]: [wr_item_sk#28, wr_return_quantity#29]
+Input [4]: [wr_item_sk#28, wr_return_quantity#29, wr_returned_date_sk#30, d_date_sk#31]
-(37) ReusedExchange [Reuses operator id: 25]
-Output [2]: [i_item_sk#26, i_item_id#27]
+(37) ReusedExchange [Reuses operator id: 10]
+Output [2]: [i_item_sk#32, i_item_id#33]
-(38) BroadcastHashJoin [codegen id : 10]
-Left keys [1]: [cr_item_sk#22]
-Right keys [1]: [i_item_sk#26]
+(38) BroadcastHashJoin [codegen id : 16]
+Left keys [1]: [wr_item_sk#28]
+Right keys [1]: [i_item_sk#32]
Join condition: None
-(39) Project [codegen id : 10]
-Output [2]: [cr_return_quantity#23, i_item_id#27]
-Input [4]: [cr_item_sk#22, cr_return_quantity#23, i_item_sk#26, i_item_id#27]
+(39) Project [codegen id : 16]
+Output [2]: [wr_return_quantity#29, i_item_id#33]
+Input [4]: [wr_item_sk#28, wr_return_quantity#29, i_item_sk#32, i_item_id#33]
-(40) HashAggregate [codegen id : 10]
-Input [2]: [cr_return_quantity#23, i_item_id#27]
-Keys [1]: [i_item_id#27]
-Functions [1]: [partial_sum(cr_return_quantity#23)]
-Aggregate Attributes [1]: [sum#28]
-Results [2]: [i_item_id#27, sum#29]
+(40) HashAggregate [codegen id : 16]
+Input [2]: [wr_return_quantity#29, i_item_id#33]
+Keys [1]: [i_item_id#33]
+Functions [1]: [partial_sum(wr_return_quantity#29)]
+Aggregate Attributes [1]: [sum#34]
+Results [2]: [i_item_id#33, sum#35]
(41) Exchange
-Input [2]: [i_item_id#27, sum#29]
-Arguments: hashpartitioning(i_item_id#27, 5), ENSURE_REQUIREMENTS, [id=#30]
+Input [2]: [i_item_id#33, sum#35]
+Arguments: hashpartitioning(i_item_id#33, 5), ENSURE_REQUIREMENTS, [id=#36]
-(42) HashAggregate [codegen id : 11]
-Input [2]: [i_item_id#27, sum#29]
-Keys [1]: [i_item_id#27]
-Functions [1]: [sum(cr_return_quantity#23)]
-Aggregate Attributes [1]: [sum(cr_return_quantity#23)#31]
-Results [2]: [i_item_id#27 AS item_id#32, sum(cr_return_quantity#23)#31 AS cr_item_qty#33]
+(42) HashAggregate [codegen id : 17]
+Input [2]: [i_item_id#33, sum#35]
+Keys [1]: [i_item_id#33]
+Functions [1]: [sum(wr_return_quantity#29)]
+Aggregate Attributes [1]: [sum(wr_return_quantity#29)#37]
+Results [2]: [i_item_id#33 AS item_id#38, sum(wr_return_quantity#29)#37 AS wr_item_qty#39]
(43) BroadcastExchange
-Input [2]: [item_id#32, cr_item_qty#33]
-Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#34]
+Input [2]: [item_id#38, wr_item_qty#39]
+Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#40]
(44) BroadcastHashJoin [codegen id : 18]
-Left keys [1]: [item_id#20]
-Right keys [1]: [item_id#32]
+Left keys [1]: [item_id#13]
+Right keys [1]: [item_id#38]
Join condition: None
(45) Project [codegen id : 18]
-Output [3]: [item_id#20, sr_item_qty#21, cr_item_qty#33]
-Input [4]: [item_id#20, sr_item_qty#21, item_id#32, cr_item_qty#33]
+Output [8]: [item_id#13, sr_item_qty#14, (((cast(sr_item_qty#14 as double) / cast(((sr_item_qty#14 + cr_item_qty#26) + wr_item_qty#39) as double)) / 3.0) * 100.0) AS sr_dev#41, cr_item_qty#26, (((cast(cr_item_qty#26 as double) / cast(((sr_item_qty#14 + cr_item_qty#26) + wr_item_qty#39) as double)) / 3.0) * 100.0) AS cr_dev#42, wr_item_qty#39, (((cast(wr_item_qty#39 as double) / cast(((sr_item_qty#14 + cr_item_qty#26) + wr_item_qty#39) as double)) / 3.0) * 100.0) AS wr_dev#43, CheckOverfl [...]
+Input [5]: [item_id#13, sr_item_qty#14, cr_item_qty#26, item_id#38, wr_item_qty#39]
+
+(46) TakeOrderedAndProject
+Input [8]: [item_id#13, sr_item_qty#14, sr_dev#41, cr_item_qty#26, cr_dev#42, wr_item_qty#39, wr_dev#43, average#44]
+Arguments: 100, [item_id#13 ASC NULLS FIRST, sr_item_qty#14 ASC NULLS FIRST], [item_id#13, sr_item_qty#14, sr_dev#41, cr_item_qty#26, cr_dev#42, wr_item_qty#39, wr_dev#43, average#44]
+
+===== Subqueries =====
+
+Subquery:1 Hosting operator id = 1 Hosting Expression = sr_returned_date_sk#3 IN dynamicpruning#4
+BroadcastExchange (62)
++- * Project (61)
+ +- * BroadcastHashJoin LeftSemi BuildRight (60)
+ :- * Filter (49)
+ : +- * ColumnarToRow (48)
+ : +- Scan parquet default.date_dim (47)
+ +- BroadcastExchange (59)
+ +- * Project (58)
+ +- * BroadcastHashJoin LeftSemi BuildRight (57)
+ :- * ColumnarToRow (51)
+ : +- Scan parquet default.date_dim (50)
+ +- BroadcastExchange (56)
+ +- * Project (55)
+ +- * Filter (54)
+ +- * ColumnarToRow (53)
+ +- Scan parquet default.date_dim (52)
+
+
+(47) Scan parquet default.date_dim
+Output [2]: [d_date_sk#5, d_date#45]
+Batched: true
+Location [not included in comparison]/{warehouse_dir}/date_dim]
+PushedFilters: [IsNotNull(d_date_sk)]
+ReadSchema: struct<d_date_sk:int,d_date:date>
-(46) Scan parquet default.web_returns
-Output [3]: [wr_item_sk#35, wr_return_quantity#36, wr_returned_date_sk#37]
+(48) ColumnarToRow [codegen id : 3]
+Input [2]: [d_date_sk#5, d_date#45]
+
+(49) Filter [codegen id : 3]
+Input [2]: [d_date_sk#5, d_date#45]
+Condition : isnotnull(d_date_sk#5)
+
+(50) Scan parquet default.date_dim
+Output [2]: [d_date#46, d_week_seq#47]
Batched: true
-Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(wr_returned_date_sk#37), dynamicpruningexpression(true)]
-PushedFilters: [IsNotNull(wr_item_sk)]
-ReadSchema: struct<wr_item_sk:int,wr_return_quantity:int>
+Location [not included in comparison]/{warehouse_dir}/date_dim]
+ReadSchema: struct<d_date:date,d_week_seq:int>
+
+(51) ColumnarToRow [codegen id : 2]
+Input [2]: [d_date#46, d_week_seq#47]
-(47) ColumnarToRow [codegen id : 16]
-Input [3]: [wr_item_sk#35, wr_return_quantity#36, wr_returned_date_sk#37]
+(52) Scan parquet default.date_dim
+Output [2]: [d_date#48, d_week_seq#49]
+Batched: true
+Location [not included in comparison]/{warehouse_dir}/date_dim]
+ReadSchema: struct<d_date:date,d_week_seq:int>
+
+(53) ColumnarToRow [codegen id : 1]
+Input [2]: [d_date#48, d_week_seq#49]
-(48) Filter [codegen id : 16]
-Input [3]: [wr_item_sk#35, wr_return_quantity#36, wr_returned_date_sk#37]
-Condition : isnotnull(wr_item_sk#35)
+(54) Filter [codegen id : 1]
+Input [2]: [d_date#48, d_week_seq#49]
+Condition : cast(d_date#48 as string) IN (2000-06-30,2000-09-27,2000-11-17)
-(49) ReusedExchange [Reuses operator id: 19]
-Output [1]: [d_date_sk#38]
+(55) Project [codegen id : 1]
+Output [1]: [d_week_seq#49]
+Input [2]: [d_date#48, d_week_seq#49]
-(50) BroadcastHashJoin [codegen id : 16]
-Left keys [1]: [wr_returned_date_sk#37]
-Right keys [1]: [d_date_sk#38]
+(56) BroadcastExchange
+Input [1]: [d_week_seq#49]
+Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#50]
+
+(57) BroadcastHashJoin [codegen id : 2]
+Left keys [1]: [d_week_seq#47]
+Right keys [1]: [d_week_seq#49]
Join condition: None
-(51) Project [codegen id : 16]
-Output [2]: [wr_item_sk#35, wr_return_quantity#36]
-Input [4]: [wr_item_sk#35, wr_return_quantity#36, wr_returned_date_sk#37, d_date_sk#38]
+(58) Project [codegen id : 2]
+Output [1]: [d_date#46]
+Input [2]: [d_date#46, d_week_seq#47]
-(52) ReusedExchange [Reuses operator id: 25]
-Output [2]: [i_item_sk#39, i_item_id#40]
+(59) BroadcastExchange
+Input [1]: [d_date#46]
+Arguments: HashedRelationBroadcastMode(List(input[0, date, true]),false), [id=#51]
-(53) BroadcastHashJoin [codegen id : 16]
-Left keys [1]: [wr_item_sk#35]
-Right keys [1]: [i_item_sk#39]
+(60) BroadcastHashJoin [codegen id : 3]
+Left keys [1]: [d_date#45]
+Right keys [1]: [d_date#46]
Join condition: None
-(54) Project [codegen id : 16]
-Output [2]: [wr_return_quantity#36, i_item_id#40]
-Input [4]: [wr_item_sk#35, wr_return_quantity#36, i_item_sk#39, i_item_id#40]
-
-(55) HashAggregate [codegen id : 16]
-Input [2]: [wr_return_quantity#36, i_item_id#40]
-Keys [1]: [i_item_id#40]
-Functions [1]: [partial_sum(wr_return_quantity#36)]
-Aggregate Attributes [1]: [sum#41]
-Results [2]: [i_item_id#40, sum#42]
-
-(56) Exchange
-Input [2]: [i_item_id#40, sum#42]
-Arguments: hashpartitioning(i_item_id#40, 5), ENSURE_REQUIREMENTS, [id=#43]
-
-(57) HashAggregate [codegen id : 17]
-Input [2]: [i_item_id#40, sum#42]
-Keys [1]: [i_item_id#40]
-Functions [1]: [sum(wr_return_quantity#36)]
-Aggregate Attributes [1]: [sum(wr_return_quantity#36)#44]
-Results [2]: [i_item_id#40 AS item_id#45, sum(wr_return_quantity#36)#44 AS wr_item_qty#46]
-
-(58) BroadcastExchange
-Input [2]: [item_id#45, wr_item_qty#46]
-Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#47]
-
-(59) BroadcastHashJoin [codegen id : 18]
-Left keys [1]: [item_id#20]
-Right keys [1]: [item_id#45]
-Join condition: None
+(61) Project [codegen id : 3]
+Output [1]: [d_date_sk#5]
+Input [2]: [d_date_sk#5, d_date#45]
+
+(62) BroadcastExchange
+Input [1]: [d_date_sk#5]
+Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#52]
+
+Subquery:2 Hosting operator id = 16 Hosting Expression = cr_returned_date_sk#17 IN dynamicpruning#4
-(60) Project [codegen id : 18]
-Output [8]: [item_id#20, sr_item_qty#21, (((cast(sr_item_qty#21 as double) / cast(((sr_item_qty#21 + cr_item_qty#33) + wr_item_qty#46) as double)) / 3.0) * 100.0) AS sr_dev#48, cr_item_qty#33, (((cast(cr_item_qty#33 as double) / cast(((sr_item_qty#21 + cr_item_qty#33) + wr_item_qty#46) as double)) / 3.0) * 100.0) AS cr_dev#49, wr_item_qty#46, (((cast(wr_item_qty#46 as double) / cast(((sr_item_qty#21 + cr_item_qty#33) + wr_item_qty#46) as double)) / 3.0) * 100.0) AS wr_dev#50, CheckOverfl [...]
-Input [5]: [item_id#20, sr_item_qty#21, cr_item_qty#33, item_id#45, wr_item_qty#46]
+Subquery:3 Hosting operator id = 31 Hosting Expression = wr_returned_date_sk#30 IN dynamicpruning#4
-(61) TakeOrderedAndProject
-Input [8]: [item_id#20, sr_item_qty#21, sr_dev#48, cr_item_qty#33, cr_dev#49, wr_item_qty#46, wr_dev#50, average#51]
-Arguments: 100, [item_id#20 ASC NULLS FIRST, sr_item_qty#21 ASC NULLS FIRST], [item_id#20, sr_item_qty#21, sr_dev#48, cr_item_qty#33, cr_dev#49, wr_item_qty#46, wr_dev#50, average#51]
diff --git a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83.sf100/simplified.txt b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83.sf100/simplified.txt
index b46131a..7f38503 100644
--- a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83.sf100/simplified.txt
+++ b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83.sf100/simplified.txt
@@ -17,31 +17,33 @@ TakeOrderedAndProject [item_id,sr_item_qty,sr_dev,cr_item_qty,cr_dev,wr_item_qty
ColumnarToRow
InputAdapter
Scan parquet default.store_returns [sr_item_sk,sr_return_quantity,sr_returned_date_sk]
- InputAdapter
- BroadcastExchange #2
- WholeStageCodegen (3)
- Project [d_date_sk]
- BroadcastHashJoin [d_date,d_date]
- Filter [d_date_sk]
- ColumnarToRow
- InputAdapter
- Scan parquet default.date_dim [d_date_sk,d_date]
- InputAdapter
- BroadcastExchange #3
- WholeStageCodegen (2)
- Project [d_date]
- BroadcastHashJoin [d_week_seq,d_week_seq]
+ SubqueryBroadcast [d_date_sk] #1
+ BroadcastExchange #2
+ WholeStageCodegen (3)
+ Project [d_date_sk]
+ BroadcastHashJoin [d_date,d_date]
+ Filter [d_date_sk]
ColumnarToRow
InputAdapter
- Scan parquet default.date_dim [d_date,d_week_seq]
- InputAdapter
- BroadcastExchange #4
- WholeStageCodegen (1)
- Project [d_week_seq]
- Filter [d_date]
- ColumnarToRow
- InputAdapter
- Scan parquet default.date_dim [d_date,d_week_seq]
+ Scan parquet default.date_dim [d_date_sk,d_date]
+ InputAdapter
+ BroadcastExchange #3
+ WholeStageCodegen (2)
+ Project [d_date]
+ BroadcastHashJoin [d_week_seq,d_week_seq]
+ ColumnarToRow
+ InputAdapter
+ Scan parquet default.date_dim [d_date,d_week_seq]
+ InputAdapter
+ BroadcastExchange #4
+ WholeStageCodegen (1)
+ Project [d_week_seq]
+ Filter [d_date]
+ ColumnarToRow
+ InputAdapter
+ Scan parquet default.date_dim [d_date,d_week_seq]
+ InputAdapter
+ ReusedExchange [d_date_sk] #2
InputAdapter
BroadcastExchange #5
WholeStageCodegen (4)
@@ -65,6 +67,7 @@ TakeOrderedAndProject [item_id,sr_item_qty,sr_dev,cr_item_qty,cr_dev,wr_item_qty
ColumnarToRow
InputAdapter
Scan parquet default.catalog_returns [cr_item_sk,cr_return_quantity,cr_returned_date_sk]
+ ReusedSubquery [d_date_sk] #1
InputAdapter
ReusedExchange [d_date_sk] #2
InputAdapter
@@ -85,6 +88,7 @@ TakeOrderedAndProject [item_id,sr_item_qty,sr_dev,cr_item_qty,cr_dev,wr_item_qty
ColumnarToRow
InputAdapter
Scan parquet default.web_returns [wr_item_sk,wr_return_quantity,wr_returned_date_sk]
+ ReusedSubquery [d_date_sk] #1
InputAdapter
ReusedExchange [d_date_sk] #2
InputAdapter
diff --git a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83/explain.txt b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83/explain.txt
index 0764824..8332d48 100644
--- a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83/explain.txt
+++ b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83/explain.txt
@@ -1,14 +1,14 @@
== Physical Plan ==
-TakeOrderedAndProject (61)
-+- * Project (60)
- +- * BroadcastHashJoin Inner BuildRight (59)
- :- * Project (45)
- : +- * BroadcastHashJoin Inner BuildRight (44)
- : :- * HashAggregate (30)
- : : +- Exchange (29)
- : : +- * HashAggregate (28)
- : : +- * Project (27)
- : : +- * BroadcastHashJoin Inner BuildRight (26)
+TakeOrderedAndProject (46)
++- * Project (45)
+ +- * BroadcastHashJoin Inner BuildRight (44)
+ :- * Project (30)
+ : +- * BroadcastHashJoin Inner BuildRight (29)
+ : :- * HashAggregate (15)
+ : : +- Exchange (14)
+ : : +- * HashAggregate (13)
+ : : +- * Project (12)
+ : : +- * BroadcastHashJoin Inner BuildRight (11)
: : :- * Project (9)
: : : +- * BroadcastHashJoin Inner BuildRight (8)
: : : :- * Filter (3)
@@ -18,55 +18,40 @@ TakeOrderedAndProject (61)
: : : +- * Filter (6)
: : : +- * ColumnarToRow (5)
: : : +- Scan parquet default.item (4)
- : : +- BroadcastExchange (25)
- : : +- * Project (24)
- : : +- * BroadcastHashJoin LeftSemi BuildRight (23)
- : : :- * Filter (12)
- : : : +- * ColumnarToRow (11)
- : : : +- Scan parquet default.date_dim (10)
- : : +- BroadcastExchange (22)
- : : +- * Project (21)
- : : +- * BroadcastHashJoin LeftSemi BuildRight (20)
- : : :- * ColumnarToRow (14)
- : : : +- Scan parquet default.date_dim (13)
- : : +- BroadcastExchange (19)
- : : +- * Project (18)
- : : +- * Filter (17)
- : : +- * ColumnarToRow (16)
- : : +- Scan parquet default.date_dim (15)
- : +- BroadcastExchange (43)
- : +- * HashAggregate (42)
- : +- Exchange (41)
- : +- * HashAggregate (40)
- : +- * Project (39)
- : +- * BroadcastHashJoin Inner BuildRight (38)
- : :- * Project (36)
- : : +- * BroadcastHashJoin Inner BuildRight (35)
- : : :- * Filter (33)
- : : : +- * ColumnarToRow (32)
- : : : +- Scan parquet default.catalog_returns (31)
- : : +- ReusedExchange (34)
- : +- ReusedExchange (37)
- +- BroadcastExchange (58)
- +- * HashAggregate (57)
- +- Exchange (56)
- +- * HashAggregate (55)
- +- * Project (54)
- +- * BroadcastHashJoin Inner BuildRight (53)
- :- * Project (51)
- : +- * BroadcastHashJoin Inner BuildRight (50)
- : :- * Filter (48)
- : : +- * ColumnarToRow (47)
- : : +- Scan parquet default.web_returns (46)
- : +- ReusedExchange (49)
- +- ReusedExchange (52)
+ : : +- ReusedExchange (10)
+ : +- BroadcastExchange (28)
+ : +- * HashAggregate (27)
+ : +- Exchange (26)
+ : +- * HashAggregate (25)
+ : +- * Project (24)
+ : +- * BroadcastHashJoin Inner BuildRight (23)
+ : :- * Project (21)
+ : : +- * BroadcastHashJoin Inner BuildRight (20)
+ : : :- * Filter (18)
+ : : : +- * ColumnarToRow (17)
+ : : : +- Scan parquet default.catalog_returns (16)
+ : : +- ReusedExchange (19)
+ : +- ReusedExchange (22)
+ +- BroadcastExchange (43)
+ +- * HashAggregate (42)
+ +- Exchange (41)
+ +- * HashAggregate (40)
+ +- * Project (39)
+ +- * BroadcastHashJoin Inner BuildRight (38)
+ :- * Project (36)
+ : +- * BroadcastHashJoin Inner BuildRight (35)
+ : :- * Filter (33)
+ : : +- * ColumnarToRow (32)
+ : : +- Scan parquet default.web_returns (31)
+ : +- ReusedExchange (34)
+ +- ReusedExchange (37)
(1) Scan parquet default.store_returns
Output [3]: [sr_item_sk#1, sr_return_quantity#2, sr_returned_date_sk#3]
Batched: true
Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(sr_returned_date_sk#3), dynamicpruningexpression(true)]
+PartitionFilters: [isnotnull(sr_returned_date_sk#3), dynamicpruningexpression(sr_returned_date_sk#3 IN dynamicpruning#4)]
PushedFilters: [IsNotNull(sr_item_sk)]
ReadSchema: struct<sr_item_sk:int,sr_return_quantity:int>
@@ -78,270 +63,299 @@ Input [3]: [sr_item_sk#1, sr_return_quantity#2, sr_returned_date_sk#3]
Condition : isnotnull(sr_item_sk#1)
(4) Scan parquet default.item
-Output [2]: [i_item_sk#4, i_item_id#5]
+Output [2]: [i_item_sk#5, i_item_id#6]
Batched: true
Location [not included in comparison]/{warehouse_dir}/item]
PushedFilters: [IsNotNull(i_item_sk), IsNotNull(i_item_id)]
ReadSchema: struct<i_item_sk:int,i_item_id:string>
(5) ColumnarToRow [codegen id : 1]
-Input [2]: [i_item_sk#4, i_item_id#5]
+Input [2]: [i_item_sk#5, i_item_id#6]
(6) Filter [codegen id : 1]
-Input [2]: [i_item_sk#4, i_item_id#5]
-Condition : (isnotnull(i_item_sk#4) AND isnotnull(i_item_id#5))
+Input [2]: [i_item_sk#5, i_item_id#6]
+Condition : (isnotnull(i_item_sk#5) AND isnotnull(i_item_id#6))
(7) BroadcastExchange
-Input [2]: [i_item_sk#4, i_item_id#5]
-Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#6]
+Input [2]: [i_item_sk#5, i_item_id#6]
+Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7]
(8) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [sr_item_sk#1]
-Right keys [1]: [i_item_sk#4]
+Right keys [1]: [i_item_sk#5]
Join condition: None
(9) Project [codegen id : 5]
-Output [3]: [sr_return_quantity#2, sr_returned_date_sk#3, i_item_id#5]
-Input [5]: [sr_item_sk#1, sr_return_quantity#2, sr_returned_date_sk#3, i_item_sk#4, i_item_id#5]
+Output [3]: [sr_return_quantity#2, sr_returned_date_sk#3, i_item_id#6]
+Input [5]: [sr_item_sk#1, sr_return_quantity#2, sr_returned_date_sk#3, i_item_sk#5, i_item_id#6]
-(10) Scan parquet default.date_dim
-Output [2]: [d_date_sk#7, d_date#8]
-Batched: true
-Location [not included in comparison]/{warehouse_dir}/date_dim]
-PushedFilters: [IsNotNull(d_date_sk)]
-ReadSchema: struct<d_date_sk:int,d_date:date>
+(10) ReusedExchange [Reuses operator id: 62]
+Output [1]: [d_date_sk#8]
-(11) ColumnarToRow [codegen id : 4]
-Input [2]: [d_date_sk#7, d_date#8]
+(11) BroadcastHashJoin [codegen id : 5]
+Left keys [1]: [sr_returned_date_sk#3]
+Right keys [1]: [d_date_sk#8]
+Join condition: None
-(12) Filter [codegen id : 4]
-Input [2]: [d_date_sk#7, d_date#8]
-Condition : isnotnull(d_date_sk#7)
+(12) Project [codegen id : 5]
+Output [2]: [sr_return_quantity#2, i_item_id#6]
+Input [4]: [sr_return_quantity#2, sr_returned_date_sk#3, i_item_id#6, d_date_sk#8]
-(13) Scan parquet default.date_dim
-Output [2]: [d_date#9, d_week_seq#10]
-Batched: true
-Location [not included in comparison]/{warehouse_dir}/date_dim]
-ReadSchema: struct<d_date:date,d_week_seq:int>
+(13) HashAggregate [codegen id : 5]
+Input [2]: [sr_return_quantity#2, i_item_id#6]
+Keys [1]: [i_item_id#6]
+Functions [1]: [partial_sum(sr_return_quantity#2)]
+Aggregate Attributes [1]: [sum#9]
+Results [2]: [i_item_id#6, sum#10]
-(14) ColumnarToRow [codegen id : 3]
-Input [2]: [d_date#9, d_week_seq#10]
+(14) Exchange
+Input [2]: [i_item_id#6, sum#10]
+Arguments: hashpartitioning(i_item_id#6, 5), ENSURE_REQUIREMENTS, [id=#11]
-(15) Scan parquet default.date_dim
-Output [2]: [d_date#11, d_week_seq#12]
-Batched: true
-Location [not included in comparison]/{warehouse_dir}/date_dim]
-ReadSchema: struct<d_date:date,d_week_seq:int>
+(15) HashAggregate [codegen id : 18]
+Input [2]: [i_item_id#6, sum#10]
+Keys [1]: [i_item_id#6]
+Functions [1]: [sum(sr_return_quantity#2)]
+Aggregate Attributes [1]: [sum(sr_return_quantity#2)#12]
+Results [2]: [i_item_id#6 AS item_id#13, sum(sr_return_quantity#2)#12 AS sr_item_qty#14]
-(16) ColumnarToRow [codegen id : 2]
-Input [2]: [d_date#11, d_week_seq#12]
+(16) Scan parquet default.catalog_returns
+Output [3]: [cr_item_sk#15, cr_return_quantity#16, cr_returned_date_sk#17]
+Batched: true
+Location: InMemoryFileIndex []
+PartitionFilters: [isnotnull(cr_returned_date_sk#17), dynamicpruningexpression(cr_returned_date_sk#17 IN dynamicpruning#4)]
+PushedFilters: [IsNotNull(cr_item_sk)]
+ReadSchema: struct<cr_item_sk:int,cr_return_quantity:int>
-(17) Filter [codegen id : 2]
-Input [2]: [d_date#11, d_week_seq#12]
-Condition : cast(d_date#11 as string) IN (2000-06-30,2000-09-27,2000-11-17)
+(17) ColumnarToRow [codegen id : 10]
+Input [3]: [cr_item_sk#15, cr_return_quantity#16, cr_returned_date_sk#17]
-(18) Project [codegen id : 2]
-Output [1]: [d_week_seq#12]
-Input [2]: [d_date#11, d_week_seq#12]
+(18) Filter [codegen id : 10]
+Input [3]: [cr_item_sk#15, cr_return_quantity#16, cr_returned_date_sk#17]
+Condition : isnotnull(cr_item_sk#15)
-(19) BroadcastExchange
-Input [1]: [d_week_seq#12]
-Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#13]
+(19) ReusedExchange [Reuses operator id: 7]
+Output [2]: [i_item_sk#18, i_item_id#19]
-(20) BroadcastHashJoin [codegen id : 3]
-Left keys [1]: [d_week_seq#10]
-Right keys [1]: [d_week_seq#12]
+(20) BroadcastHashJoin [codegen id : 10]
+Left keys [1]: [cr_item_sk#15]
+Right keys [1]: [i_item_sk#18]
Join condition: None
-(21) Project [codegen id : 3]
-Output [1]: [d_date#9]
-Input [2]: [d_date#9, d_week_seq#10]
+(21) Project [codegen id : 10]
+Output [3]: [cr_return_quantity#16, cr_returned_date_sk#17, i_item_id#19]
+Input [5]: [cr_item_sk#15, cr_return_quantity#16, cr_returned_date_sk#17, i_item_sk#18, i_item_id#19]
-(22) BroadcastExchange
-Input [1]: [d_date#9]
-Arguments: HashedRelationBroadcastMode(List(input[0, date, true]),false), [id=#14]
+(22) ReusedExchange [Reuses operator id: 62]
+Output [1]: [d_date_sk#20]
-(23) BroadcastHashJoin [codegen id : 4]
-Left keys [1]: [d_date#8]
-Right keys [1]: [d_date#9]
+(23) BroadcastHashJoin [codegen id : 10]
+Left keys [1]: [cr_returned_date_sk#17]
+Right keys [1]: [d_date_sk#20]
Join condition: None
-(24) Project [codegen id : 4]
-Output [1]: [d_date_sk#7]
-Input [2]: [d_date_sk#7, d_date#8]
-
-(25) BroadcastExchange
-Input [1]: [d_date_sk#7]
-Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#15]
-
-(26) BroadcastHashJoin [codegen id : 5]
-Left keys [1]: [sr_returned_date_sk#3]
-Right keys [1]: [d_date_sk#7]
+(24) Project [codegen id : 10]
+Output [2]: [cr_return_quantity#16, i_item_id#19]
+Input [4]: [cr_return_quantity#16, cr_returned_date_sk#17, i_item_id#19, d_date_sk#20]
+
+(25) HashAggregate [codegen id : 10]
+Input [2]: [cr_return_quantity#16, i_item_id#19]
+Keys [1]: [i_item_id#19]
+Functions [1]: [partial_sum(cr_return_quantity#16)]
+Aggregate Attributes [1]: [sum#21]
+Results [2]: [i_item_id#19, sum#22]
+
+(26) Exchange
+Input [2]: [i_item_id#19, sum#22]
+Arguments: hashpartitioning(i_item_id#19, 5), ENSURE_REQUIREMENTS, [id=#23]
+
+(27) HashAggregate [codegen id : 11]
+Input [2]: [i_item_id#19, sum#22]
+Keys [1]: [i_item_id#19]
+Functions [1]: [sum(cr_return_quantity#16)]
+Aggregate Attributes [1]: [sum(cr_return_quantity#16)#24]
+Results [2]: [i_item_id#19 AS item_id#25, sum(cr_return_quantity#16)#24 AS cr_item_qty#26]
+
+(28) BroadcastExchange
+Input [2]: [item_id#25, cr_item_qty#26]
+Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#27]
+
+(29) BroadcastHashJoin [codegen id : 18]
+Left keys [1]: [item_id#13]
+Right keys [1]: [item_id#25]
Join condition: None
-(27) Project [codegen id : 5]
-Output [2]: [sr_return_quantity#2, i_item_id#5]
-Input [4]: [sr_return_quantity#2, sr_returned_date_sk#3, i_item_id#5, d_date_sk#7]
-
-(28) HashAggregate [codegen id : 5]
-Input [2]: [sr_return_quantity#2, i_item_id#5]
-Keys [1]: [i_item_id#5]
-Functions [1]: [partial_sum(sr_return_quantity#2)]
-Aggregate Attributes [1]: [sum#16]
-Results [2]: [i_item_id#5, sum#17]
-
-(29) Exchange
-Input [2]: [i_item_id#5, sum#17]
-Arguments: hashpartitioning(i_item_id#5, 5), ENSURE_REQUIREMENTS, [id=#18]
-
-(30) HashAggregate [codegen id : 18]
-Input [2]: [i_item_id#5, sum#17]
-Keys [1]: [i_item_id#5]
-Functions [1]: [sum(sr_return_quantity#2)]
-Aggregate Attributes [1]: [sum(sr_return_quantity#2)#19]
-Results [2]: [i_item_id#5 AS item_id#20, sum(sr_return_quantity#2)#19 AS sr_item_qty#21]
+(30) Project [codegen id : 18]
+Output [3]: [item_id#13, sr_item_qty#14, cr_item_qty#26]
+Input [4]: [item_id#13, sr_item_qty#14, item_id#25, cr_item_qty#26]
-(31) Scan parquet default.catalog_returns
-Output [3]: [cr_item_sk#22, cr_return_quantity#23, cr_returned_date_sk#24]
+(31) Scan parquet default.web_returns
+Output [3]: [wr_item_sk#28, wr_return_quantity#29, wr_returned_date_sk#30]
Batched: true
Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(cr_returned_date_sk#24), dynamicpruningexpression(true)]
-PushedFilters: [IsNotNull(cr_item_sk)]
-ReadSchema: struct<cr_item_sk:int,cr_return_quantity:int>
+PartitionFilters: [isnotnull(wr_returned_date_sk#30), dynamicpruningexpression(wr_returned_date_sk#30 IN dynamicpruning#4)]
+PushedFilters: [IsNotNull(wr_item_sk)]
+ReadSchema: struct<wr_item_sk:int,wr_return_quantity:int>
-(32) ColumnarToRow [codegen id : 10]
-Input [3]: [cr_item_sk#22, cr_return_quantity#23, cr_returned_date_sk#24]
+(32) ColumnarToRow [codegen id : 16]
+Input [3]: [wr_item_sk#28, wr_return_quantity#29, wr_returned_date_sk#30]
-(33) Filter [codegen id : 10]
-Input [3]: [cr_item_sk#22, cr_return_quantity#23, cr_returned_date_sk#24]
-Condition : isnotnull(cr_item_sk#22)
+(33) Filter [codegen id : 16]
+Input [3]: [wr_item_sk#28, wr_return_quantity#29, wr_returned_date_sk#30]
+Condition : isnotnull(wr_item_sk#28)
(34) ReusedExchange [Reuses operator id: 7]
-Output [2]: [i_item_sk#25, i_item_id#26]
+Output [2]: [i_item_sk#31, i_item_id#32]
-(35) BroadcastHashJoin [codegen id : 10]
-Left keys [1]: [cr_item_sk#22]
-Right keys [1]: [i_item_sk#25]
+(35) BroadcastHashJoin [codegen id : 16]
+Left keys [1]: [wr_item_sk#28]
+Right keys [1]: [i_item_sk#31]
Join condition: None
-(36) Project [codegen id : 10]
-Output [3]: [cr_return_quantity#23, cr_returned_date_sk#24, i_item_id#26]
-Input [5]: [cr_item_sk#22, cr_return_quantity#23, cr_returned_date_sk#24, i_item_sk#25, i_item_id#26]
+(36) Project [codegen id : 16]
+Output [3]: [wr_return_quantity#29, wr_returned_date_sk#30, i_item_id#32]
+Input [5]: [wr_item_sk#28, wr_return_quantity#29, wr_returned_date_sk#30, i_item_sk#31, i_item_id#32]
-(37) ReusedExchange [Reuses operator id: 25]
-Output [1]: [d_date_sk#27]
+(37) ReusedExchange [Reuses operator id: 62]
+Output [1]: [d_date_sk#33]
-(38) BroadcastHashJoin [codegen id : 10]
-Left keys [1]: [cr_returned_date_sk#24]
-Right keys [1]: [d_date_sk#27]
+(38) BroadcastHashJoin [codegen id : 16]
+Left keys [1]: [wr_returned_date_sk#30]
+Right keys [1]: [d_date_sk#33]
Join condition: None
-(39) Project [codegen id : 10]
-Output [2]: [cr_return_quantity#23, i_item_id#26]
-Input [4]: [cr_return_quantity#23, cr_returned_date_sk#24, i_item_id#26, d_date_sk#27]
+(39) Project [codegen id : 16]
+Output [2]: [wr_return_quantity#29, i_item_id#32]
+Input [4]: [wr_return_quantity#29, wr_returned_date_sk#30, i_item_id#32, d_date_sk#33]
-(40) HashAggregate [codegen id : 10]
-Input [2]: [cr_return_quantity#23, i_item_id#26]
-Keys [1]: [i_item_id#26]
-Functions [1]: [partial_sum(cr_return_quantity#23)]
-Aggregate Attributes [1]: [sum#28]
-Results [2]: [i_item_id#26, sum#29]
+(40) HashAggregate [codegen id : 16]
+Input [2]: [wr_return_quantity#29, i_item_id#32]
+Keys [1]: [i_item_id#32]
+Functions [1]: [partial_sum(wr_return_quantity#29)]
+Aggregate Attributes [1]: [sum#34]
+Results [2]: [i_item_id#32, sum#35]
(41) Exchange
-Input [2]: [i_item_id#26, sum#29]
-Arguments: hashpartitioning(i_item_id#26, 5), ENSURE_REQUIREMENTS, [id=#30]
+Input [2]: [i_item_id#32, sum#35]
+Arguments: hashpartitioning(i_item_id#32, 5), ENSURE_REQUIREMENTS, [id=#36]
-(42) HashAggregate [codegen id : 11]
-Input [2]: [i_item_id#26, sum#29]
-Keys [1]: [i_item_id#26]
-Functions [1]: [sum(cr_return_quantity#23)]
-Aggregate Attributes [1]: [sum(cr_return_quantity#23)#31]
-Results [2]: [i_item_id#26 AS item_id#32, sum(cr_return_quantity#23)#31 AS cr_item_qty#33]
+(42) HashAggregate [codegen id : 17]
+Input [2]: [i_item_id#32, sum#35]
+Keys [1]: [i_item_id#32]
+Functions [1]: [sum(wr_return_quantity#29)]
+Aggregate Attributes [1]: [sum(wr_return_quantity#29)#37]
+Results [2]: [i_item_id#32 AS item_id#38, sum(wr_return_quantity#29)#37 AS wr_item_qty#39]
(43) BroadcastExchange
-Input [2]: [item_id#32, cr_item_qty#33]
-Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#34]
+Input [2]: [item_id#38, wr_item_qty#39]
+Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#40]
(44) BroadcastHashJoin [codegen id : 18]
-Left keys [1]: [item_id#20]
-Right keys [1]: [item_id#32]
+Left keys [1]: [item_id#13]
+Right keys [1]: [item_id#38]
Join condition: None
(45) Project [codegen id : 18]
-Output [3]: [item_id#20, sr_item_qty#21, cr_item_qty#33]
-Input [4]: [item_id#20, sr_item_qty#21, item_id#32, cr_item_qty#33]
+Output [8]: [item_id#13, sr_item_qty#14, (((cast(sr_item_qty#14 as double) / cast(((sr_item_qty#14 + cr_item_qty#26) + wr_item_qty#39) as double)) / 3.0) * 100.0) AS sr_dev#41, cr_item_qty#26, (((cast(cr_item_qty#26 as double) / cast(((sr_item_qty#14 + cr_item_qty#26) + wr_item_qty#39) as double)) / 3.0) * 100.0) AS cr_dev#42, wr_item_qty#39, (((cast(wr_item_qty#39 as double) / cast(((sr_item_qty#14 + cr_item_qty#26) + wr_item_qty#39) as double)) / 3.0) * 100.0) AS wr_dev#43, CheckOverfl [...]
+Input [5]: [item_id#13, sr_item_qty#14, cr_item_qty#26, item_id#38, wr_item_qty#39]
+
+(46) TakeOrderedAndProject
+Input [8]: [item_id#13, sr_item_qty#14, sr_dev#41, cr_item_qty#26, cr_dev#42, wr_item_qty#39, wr_dev#43, average#44]
+Arguments: 100, [item_id#13 ASC NULLS FIRST, sr_item_qty#14 ASC NULLS FIRST], [item_id#13, sr_item_qty#14, sr_dev#41, cr_item_qty#26, cr_dev#42, wr_item_qty#39, wr_dev#43, average#44]
+
+===== Subqueries =====
+
+Subquery:1 Hosting operator id = 1 Hosting Expression = sr_returned_date_sk#3 IN dynamicpruning#4
+BroadcastExchange (62)
++- * Project (61)
+ +- * BroadcastHashJoin LeftSemi BuildRight (60)
+ :- * Filter (49)
+ : +- * ColumnarToRow (48)
+ : +- Scan parquet default.date_dim (47)
+ +- BroadcastExchange (59)
+ +- * Project (58)
+ +- * BroadcastHashJoin LeftSemi BuildRight (57)
+ :- * ColumnarToRow (51)
+ : +- Scan parquet default.date_dim (50)
+ +- BroadcastExchange (56)
+ +- * Project (55)
+ +- * Filter (54)
+ +- * ColumnarToRow (53)
+ +- Scan parquet default.date_dim (52)
+
+
+(47) Scan parquet default.date_dim
+Output [2]: [d_date_sk#8, d_date#45]
+Batched: true
+Location [not included in comparison]/{warehouse_dir}/date_dim]
+PushedFilters: [IsNotNull(d_date_sk)]
+ReadSchema: struct<d_date_sk:int,d_date:date>
+
+(48) ColumnarToRow [codegen id : 3]
+Input [2]: [d_date_sk#8, d_date#45]
-(46) Scan parquet default.web_returns
-Output [3]: [wr_item_sk#35, wr_return_quantity#36, wr_returned_date_sk#37]
+(49) Filter [codegen id : 3]
+Input [2]: [d_date_sk#8, d_date#45]
+Condition : isnotnull(d_date_sk#8)
+
+(50) Scan parquet default.date_dim
+Output [2]: [d_date#46, d_week_seq#47]
Batched: true
-Location: InMemoryFileIndex []
-PartitionFilters: [isnotnull(wr_returned_date_sk#37), dynamicpruningexpression(true)]
-PushedFilters: [IsNotNull(wr_item_sk)]
-ReadSchema: struct<wr_item_sk:int,wr_return_quantity:int>
+Location [not included in comparison]/{warehouse_dir}/date_dim]
+ReadSchema: struct<d_date:date,d_week_seq:int>
+
+(51) ColumnarToRow [codegen id : 2]
+Input [2]: [d_date#46, d_week_seq#47]
+
+(52) Scan parquet default.date_dim
+Output [2]: [d_date#48, d_week_seq#49]
+Batched: true
+Location [not included in comparison]/{warehouse_dir}/date_dim]
+ReadSchema: struct<d_date:date,d_week_seq:int>
+
+(53) ColumnarToRow [codegen id : 1]
+Input [2]: [d_date#48, d_week_seq#49]
-(47) ColumnarToRow [codegen id : 16]
-Input [3]: [wr_item_sk#35, wr_return_quantity#36, wr_returned_date_sk#37]
+(54) Filter [codegen id : 1]
+Input [2]: [d_date#48, d_week_seq#49]
+Condition : cast(d_date#48 as string) IN (2000-06-30,2000-09-27,2000-11-17)
-(48) Filter [codegen id : 16]
-Input [3]: [wr_item_sk#35, wr_return_quantity#36, wr_returned_date_sk#37]
-Condition : isnotnull(wr_item_sk#35)
+(55) Project [codegen id : 1]
+Output [1]: [d_week_seq#49]
+Input [2]: [d_date#48, d_week_seq#49]
-(49) ReusedExchange [Reuses operator id: 7]
-Output [2]: [i_item_sk#38, i_item_id#39]
+(56) BroadcastExchange
+Input [1]: [d_week_seq#49]
+Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#50]
-(50) BroadcastHashJoin [codegen id : 16]
-Left keys [1]: [wr_item_sk#35]
-Right keys [1]: [i_item_sk#38]
+(57) BroadcastHashJoin [codegen id : 2]
+Left keys [1]: [d_week_seq#47]
+Right keys [1]: [d_week_seq#49]
Join condition: None
-(51) Project [codegen id : 16]
-Output [3]: [wr_return_quantity#36, wr_returned_date_sk#37, i_item_id#39]
-Input [5]: [wr_item_sk#35, wr_return_quantity#36, wr_returned_date_sk#37, i_item_sk#38, i_item_id#39]
+(58) Project [codegen id : 2]
+Output [1]: [d_date#46]
+Input [2]: [d_date#46, d_week_seq#47]
-(52) ReusedExchange [Reuses operator id: 25]
-Output [1]: [d_date_sk#40]
+(59) BroadcastExchange
+Input [1]: [d_date#46]
+Arguments: HashedRelationBroadcastMode(List(input[0, date, true]),false), [id=#51]
-(53) BroadcastHashJoin [codegen id : 16]
-Left keys [1]: [wr_returned_date_sk#37]
-Right keys [1]: [d_date_sk#40]
+(60) BroadcastHashJoin [codegen id : 3]
+Left keys [1]: [d_date#45]
+Right keys [1]: [d_date#46]
Join condition: None
-(54) Project [codegen id : 16]
-Output [2]: [wr_return_quantity#36, i_item_id#39]
-Input [4]: [wr_return_quantity#36, wr_returned_date_sk#37, i_item_id#39, d_date_sk#40]
-
-(55) HashAggregate [codegen id : 16]
-Input [2]: [wr_return_quantity#36, i_item_id#39]
-Keys [1]: [i_item_id#39]
-Functions [1]: [partial_sum(wr_return_quantity#36)]
-Aggregate Attributes [1]: [sum#41]
-Results [2]: [i_item_id#39, sum#42]
-
-(56) Exchange
-Input [2]: [i_item_id#39, sum#42]
-Arguments: hashpartitioning(i_item_id#39, 5), ENSURE_REQUIREMENTS, [id=#43]
-
-(57) HashAggregate [codegen id : 17]
-Input [2]: [i_item_id#39, sum#42]
-Keys [1]: [i_item_id#39]
-Functions [1]: [sum(wr_return_quantity#36)]
-Aggregate Attributes [1]: [sum(wr_return_quantity#36)#44]
-Results [2]: [i_item_id#39 AS item_id#45, sum(wr_return_quantity#36)#44 AS wr_item_qty#46]
-
-(58) BroadcastExchange
-Input [2]: [item_id#45, wr_item_qty#46]
-Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#47]
-
-(59) BroadcastHashJoin [codegen id : 18]
-Left keys [1]: [item_id#20]
-Right keys [1]: [item_id#45]
-Join condition: None
+(61) Project [codegen id : 3]
+Output [1]: [d_date_sk#8]
+Input [2]: [d_date_sk#8, d_date#45]
+
+(62) BroadcastExchange
+Input [1]: [d_date_sk#8]
+Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#52]
+
+Subquery:2 Hosting operator id = 16 Hosting Expression = cr_returned_date_sk#17 IN dynamicpruning#4
-(60) Project [codegen id : 18]
-Output [8]: [item_id#20, sr_item_qty#21, (((cast(sr_item_qty#21 as double) / cast(((sr_item_qty#21 + cr_item_qty#33) + wr_item_qty#46) as double)) / 3.0) * 100.0) AS sr_dev#48, cr_item_qty#33, (((cast(cr_item_qty#33 as double) / cast(((sr_item_qty#21 + cr_item_qty#33) + wr_item_qty#46) as double)) / 3.0) * 100.0) AS cr_dev#49, wr_item_qty#46, (((cast(wr_item_qty#46 as double) / cast(((sr_item_qty#21 + cr_item_qty#33) + wr_item_qty#46) as double)) / 3.0) * 100.0) AS wr_dev#50, CheckOverfl [...]
-Input [5]: [item_id#20, sr_item_qty#21, cr_item_qty#33, item_id#45, wr_item_qty#46]
+Subquery:3 Hosting operator id = 31 Hosting Expression = wr_returned_date_sk#30 IN dynamicpruning#4
-(61) TakeOrderedAndProject
-Input [8]: [item_id#20, sr_item_qty#21, sr_dev#48, cr_item_qty#33, cr_dev#49, wr_item_qty#46, wr_dev#50, average#51]
-Arguments: 100, [item_id#20 ASC NULLS FIRST, sr_item_qty#21 ASC NULLS FIRST], [item_id#20, sr_item_qty#21, sr_dev#48, cr_item_qty#33, cr_dev#49, wr_item_qty#46, wr_dev#50, average#51]
diff --git a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83/simplified.txt b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83/simplified.txt
index 7fc930f..29ff19d 100644
--- a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83/simplified.txt
+++ b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q83/simplified.txt
@@ -17,38 +17,40 @@ TakeOrderedAndProject [item_id,sr_item_qty,sr_dev,cr_item_qty,cr_dev,wr_item_qty
ColumnarToRow
InputAdapter
Scan parquet default.store_returns [sr_item_sk,sr_return_quantity,sr_returned_date_sk]
+ SubqueryBroadcast [d_date_sk] #1
+ BroadcastExchange #2
+ WholeStageCodegen (3)
+ Project [d_date_sk]
+ BroadcastHashJoin [d_date,d_date]
+ Filter [d_date_sk]
+ ColumnarToRow
+ InputAdapter
+ Scan parquet default.date_dim [d_date_sk,d_date]
+ InputAdapter
+ BroadcastExchange #3
+ WholeStageCodegen (2)
+ Project [d_date]
+ BroadcastHashJoin [d_week_seq,d_week_seq]
+ ColumnarToRow
+ InputAdapter
+ Scan parquet default.date_dim [d_date,d_week_seq]
+ InputAdapter
+ BroadcastExchange #4
+ WholeStageCodegen (1)
+ Project [d_week_seq]
+ Filter [d_date]
+ ColumnarToRow
+ InputAdapter
+ Scan parquet default.date_dim [d_date,d_week_seq]
InputAdapter
- BroadcastExchange #2
+ BroadcastExchange #5
WholeStageCodegen (1)
Filter [i_item_sk,i_item_id]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_item_sk,i_item_id]
InputAdapter
- BroadcastExchange #3
- WholeStageCodegen (4)
- Project [d_date_sk]
- BroadcastHashJoin [d_date,d_date]
- Filter [d_date_sk]
- ColumnarToRow
- InputAdapter
- Scan parquet default.date_dim [d_date_sk,d_date]
- InputAdapter
- BroadcastExchange #4
- WholeStageCodegen (3)
- Project [d_date]
- BroadcastHashJoin [d_week_seq,d_week_seq]
- ColumnarToRow
- InputAdapter
- Scan parquet default.date_dim [d_date,d_week_seq]
- InputAdapter
- BroadcastExchange #5
- WholeStageCodegen (2)
- Project [d_week_seq]
- Filter [d_date]
- ColumnarToRow
- InputAdapter
- Scan parquet default.date_dim [d_date,d_week_seq]
+ ReusedExchange [d_date_sk] #2
InputAdapter
BroadcastExchange #6
WholeStageCodegen (11)
@@ -65,10 +67,11 @@ TakeOrderedAndProject [item_id,sr_item_qty,sr_dev,cr_item_qty,cr_dev,wr_item_qty
ColumnarToRow
InputAdapter
Scan parquet default.catalog_returns [cr_item_sk,cr_return_quantity,cr_returned_date_sk]
+ ReusedSubquery [d_date_sk] #1
InputAdapter
- ReusedExchange [i_item_sk,i_item_id] #2
+ ReusedExchange [i_item_sk,i_item_id] #5
InputAdapter
- ReusedExchange [d_date_sk] #3
+ ReusedExchange [d_date_sk] #2
InputAdapter
BroadcastExchange #8
WholeStageCodegen (17)
@@ -85,7 +88,8 @@ TakeOrderedAndProject [item_id,sr_item_qty,sr_dev,cr_item_qty,cr_dev,wr_item_qty
ColumnarToRow
InputAdapter
Scan parquet default.web_returns [wr_item_sk,wr_return_quantity,wr_returned_date_sk]
+ ReusedSubquery [d_date_sk] #1
InputAdapter
- ReusedExchange [i_item_sk,i_item_id] #2
+ ReusedExchange [i_item_sk,i_item_id] #5
InputAdapter
- ReusedExchange [d_date_sk] #3
+ ReusedExchange [d_date_sk] #2
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala
index 38527fb..f058bb5 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala
@@ -1491,6 +1491,20 @@ abstract class DynamicPartitionPruningSuiteBase
checkAnswer(df, Row(3, 2) :: Row(3, 2) :: Row(3, 2) :: Row(3, 2) :: Nil)
}
+
+ test("SPARK-36444: Remove OptimizeSubqueries from batch of PartitionPruning") {
+ withSQLConf(SQLConf.DYNAMIC_PARTITION_PRUNING_ENABLED.key -> "true") {
+ val df = sql(
+ """
+ |SELECT date_id, product_id FROM fact_sk f
+ |JOIN (select store_id + 3 as new_store_id from dim_store where country = 'US') s
+ |ON f.store_id = s.new_store_id
+ """.stripMargin)
+
+ checkPartitionPruningPredicate(df, false, true)
+ checkAnswer(df, Row(1150, 1) :: Row(1130, 4) :: Row(1140, 4) :: Nil)
+ }
+ }
}
abstract class DynamicPartitionPruningV1Suite extends DynamicPartitionPruningSuiteBase {
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org