Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/06/13 14:12:31 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6659: fix: port unstable subquery to sqllogicaltest

alamb commented on code in PR #6659:
URL: https://github.com/apache/arrow-datafusion/pull/6659#discussion_r1228199477


##########
datafusion/core/tests/sqllogictests/test_files/subquery.slt:
##########
@@ -107,3 +107,144 @@ from t1
 where t1_int = (select max(i) from (values (1)) as s(i));
 ----
 11
+
+# aggregated_correlated_scalar_subquery
+query TT
+explain SELECT t1_id, (SELECT sum(t2_int) FROM t2 WHERE t2.t2_id = t1.t1_id) as t2_sum from t1
+----
+logical_plan
+Projection: t1.t1_id, __scalar_sq_2.SUM(t2.t2_int) AS t2_sum
+--Left Join: t1.t1_id = __scalar_sq_2.t2_id
+----TableScan: t1 projection=[t1_id]
+----SubqueryAlias: __scalar_sq_2
+------Projection: SUM(t2.t2_int), t2.t2_id
+--------Aggregate: groupBy=[[t2.t2_id]], aggr=[[SUM(t2.t2_int)]]
+----------TableScan: t2 projection=[t2_id, t2_int]
+physical_plan
+ProjectionExec: expr=[t1_id@0 as t1_id, SUM(t2.t2_int)@1 as t2_sum]
+--CoalesceBatchesExec: target_batch_size=8192
+----HashJoinExec: mode=Partitioned, join_type=Left, on=[(Column { name: "t1_id", index: 0 }, Column { name: "t2_id", index: 1 })]
+------CoalesceBatchesExec: target_batch_size=8192
+--------RepartitionExec: partitioning=Hash([Column { name: "t1_id", index: 0 }], 4), input_partitions=4
+----------MemoryExec: partitions=4, partition_sizes=[1, 0, 0, 0]
+------ProjectionExec: expr=[SUM(t2.t2_int)@1 as SUM(t2.t2_int), t2_id@0 as t2_id]
+--------AggregateExec: mode=FinalPartitioned, gby=[t2_id@0 as t2_id], aggr=[SUM(t2.t2_int)]
+----------CoalesceBatchesExec: target_batch_size=8192
+------------RepartitionExec: partitioning=Hash([Column { name: "t2_id", index: 0 }], 4), input_partitions=4
+--------------AggregateExec: mode=Partial, gby=[t2_id@0 as t2_id], aggr=[SUM(t2.t2_int)]
+----------------MemoryExec: partitions=4, partition_sizes=[1, 0, 0, 0]
+
+query II

Review Comment:
   I think you need to add `rowsort` on this line so the output order is consistent, since the query itself doesn't guarantee an order:
   
   ```suggestion
   query II rowsort
   ```
   
   Thanks @jackwener 
   
   ```
   [Diff] (-expected|+actual)
   +   22 1
       33 NULL
       11 3
   -   22 1
       44 3
   at tests/sqllogictests/test_files/subquery.slt:137
   ```
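
To illustrate why `rowsort` would help here, below is a minimal Python sketch of an order-insensitive comparison. It is not the sqllogictest runner's actual implementation: `rows_match` is a hypothetical helper, and sorting the rendered row text is an assumption about how `rowsort` behaves. The rows are the ones from the diff above.

```python
# Rows copied from the diff above, rendered as text.
# "expected" is the order recorded in the .slt file, "actual" is what the run produced.
expected = ["33 NULL", "11 3", "22 1", "44 3"]
actual = ["22 1", "33 NULL", "11 3", "44 3"]


def rows_match(expected, actual, rowsort=False):
    """Compare two result sets; with rowsort=True, row order is ignored.

    Hypothetical helper: it sorts the rendered row strings on both sides
    before comparing, which is the assumed effect of `rowsort`.
    """
    if rowsort:
        expected, actual = sorted(expected), sorted(actual)
    return expected == actual


print(rows_match(expected, actual))                # False: same rows, different order
print(rows_match(expected, actual, rowsort=True))  # True: order no longer matters
```

An alternative would be an explicit `ORDER BY` in the query itself, but that changes the plan under test, so `rowsort` is probably the less invasive fix.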
   
   


