You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "ozankabak (via GitHub)" <gi...@apache.org> on 2023/02/06 19:30:43 UTC

[GitHub] [arrow-datafusion] ozankabak commented on a diff in pull request #5171: Make EnforceSorting global sort aware, fix sort mis-optimizations involving unions, support parallel sort + merge transformations

ozankabak commented on code in PR #5171:
URL: https://github.com/apache/arrow-datafusion/pull/5171#discussion_r1097831067


##########
datafusion/core/tests/sql/joins.rs:
##########
@@ -1980,8 +1980,8 @@ async fn left_semi_join() -> Result<()> {
         let physical_plan = dataframe.create_physical_plan().await?;
         let expected = if repartition_joins {
             vec![
-                "SortExec: [t1_id@0 ASC NULLS LAST]",
-                "  CoalescePartitionsExec",
+                "SortPreservingMergeExec: [t1_id@0 ASC NULLS LAST]",
+                "  SortExec: [t1_id@0 ASC NULLS LAST]",

Review Comment:
   It replaces coalesce + a single sort with multiple (parallel) sorts + merge, which is a better choice in multi-threaded environments with an efficient merge operator.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org