Posted to github@arrow.apache.org by "mingmwang (via GitHub)" <gi...@apache.org> on 2023/06/13 16:48:59 UTC

[GitHub] [arrow-datafusion] mingmwang commented on a diff in pull request #6659: fix: port unstable subquery to sqllogicaltest

mingmwang commented on code in PR #6659:
URL: https://github.com/apache/arrow-datafusion/pull/6659#discussion_r1228429913


##########
datafusion/core/tests/sqllogictests/test_files/subquery.slt:
##########
@@ -107,3 +107,144 @@ from t1
 where t1_int = (select max(i) from (values (1)) as s(i));
 ----
 11
+
+# aggregated_correlated_scalar_subquery
+query TT
+explain SELECT t1_id, (SELECT sum(t2_int) FROM t2 WHERE t2.t2_id = t1.t1_id) as t2_sum from t1
+----
+logical_plan
+Projection: t1.t1_id, __scalar_sq_2.SUM(t2.t2_int) AS t2_sum
+--Left Join: t1.t1_id = __scalar_sq_2.t2_id
+----TableScan: t1 projection=[t1_id]
+----SubqueryAlias: __scalar_sq_2
+------Projection: SUM(t2.t2_int), t2.t2_id
+--------Aggregate: groupBy=[[t2.t2_id]], aggr=[[SUM(t2.t2_int)]]
+----------TableScan: t2 projection=[t2_id, t2_int]
+physical_plan
+ProjectionExec: expr=[t1_id@0 as t1_id, SUM(t2.t2_int)@1 as t2_sum]
+--CoalesceBatchesExec: target_batch_size=8192
+----HashJoinExec: mode=Partitioned, join_type=Left, on=[(Column { name: "t1_id", index: 0 }, Column { name: "t2_id", index: 1 })]
+------CoalesceBatchesExec: target_batch_size=8192
+--------RepartitionExec: partitioning=Hash([Column { name: "t1_id", index: 0 }], 4), input_partitions=4
+----------MemoryExec: partitions=4, partition_sizes=[1, 0, 0, 0]
+------ProjectionExec: expr=[SUM(t2.t2_int)@1 as SUM(t2.t2_int), t2_id@0 as t2_id]
+--------AggregateExec: mode=FinalPartitioned, gby=[t2_id@0 as t2_id], aggr=[SUM(t2.t2_int)]
+----------CoalesceBatchesExec: target_batch_size=8192
+------------RepartitionExec: partitioning=Hash([Column { name: "t2_id", index: 0 }], 4), input_partitions=4
+--------------AggregateExec: mode=Partial, gby=[t2_id@0 as t2_id], aggr=[SUM(t2.t2_int)]
+----------------MemoryExec: partitions=4, partition_sizes=[1, 0, 0, 0]
+
+query II

Review Comment:
   Yes, I think the test is unstable because the result rows are returned without sorting.
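
   To illustrate the point, here is a minimal sketch in Python (not DataFusion code) of why comparing unsorted output against a fixed expected block is flaky. `run_query` is a hypothetical stand-in for a multi-partition query whose row order is not guaranteed, and the rows are made-up sample data, not the contents of t1 or t2.

   ```python
   import random

   def run_query(rows, seed):
       """Hypothetical stand-in for a partitioned query: same rows, arbitrary order."""
       shuffled = list(rows)
       random.Random(seed).shuffle(shuffled)
       return shuffled

   def matches_exact(actual, expected):
       # Order-sensitive comparison, like an expected block checked line by line.
       return actual == expected

   def matches_sorted(actual, expected):
       # Order-insensitive comparison: sort both sides before comparing.
       return sorted(actual) == sorted(expected)

   # Made-up sample rows, for illustration only.
   expected = [(1, 10), (2, 20), (3, 30), (4, 40)]

   # Simulate 20 runs whose output order differs from run to run.
   outcomes_exact = {matches_exact(run_query(expected, s), expected) for s in range(20)}
   outcomes_sorted = {matches_sorted(run_query(expected, s), expected) for s in range(20)}

   print("exact comparison outcomes:", outcomes_exact)
   print("sorted comparison outcomes:", outcomes_sorted)
   ```

   The order-sensitive check fails whenever the rows come back in a different order, while the sorted check passes on every run. In the .slt file the equivalent fix would presumably be an ORDER BY in the query or a sort mode such as rowsort on the query line, assuming the test runner supports it.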


