You are viewing a plain text version of this content. The canonical link for it is here.

Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/21 13:02:34 UTC

[GitHub] [arrow-datafusion] tustvold opened a new pull request #378: Add support for multiple partitions with SortExec

tustvold opened a new pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378


   # Which issue does this PR close?
   
   Closes #362
   
    # Rationale for this change
   
   Once an order preserving merge operator is added as part of #362 it will be possible to combine multiple sorted partitions together into a single partition - effectively yielding partitioned sort. Loosening the restriction on SortExec to a single partition allows it to form the sort part of this.
   
   # What changes are included in this PR?
   
   SortExec is no longer restricted to a single partition, instead preserving the partitioning of its inputs
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] Dandandan edited a comment on pull request #378: Add support for multiple partitions with SortExec (#362)

Posted by GitBox <gi...@apache.org>.

Dandandan edited a comment on pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378#issuecomment-846048412


   Currently sort needs a single partition as otherwise the partitions are not sorted. A mergeexec currently is added based on this requirement.
   
   So this won't work I think untill we have the implementation to merge the sorted partitions which you are working on in https://github.com/apache/arrow-datafusion/pull/379


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] alamb commented on pull request #378: Add support for multiple partitions with SortExec (#362)

Posted by GitBox <gi...@apache.org>.

alamb commented on pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378#issuecomment-847217916


   > I think you are right about that, we should not rely on the optimizer to make the execution plan correct. I think it would be better if the planner adds the MergeExecs for the appropriate nodes.
   
   I agree with @tustvold  and @Dandandan on this -- I think the plan should generate correct results without requiring optimizer passes being run. The optimizer passes should just (potentially) make the plans faster. 
   
    > I therefore think the addition of a preserve_partitioning flag to SortExec is necessary and has precedent.
   
   I agree
   
   > Currently Repartition may insert RepartitionExec between an operator and its children, provided that operator doesn't require a single partition. It is then reliant on a later optimisation pass with AddMergeExec to join together the partitions if a operator further up the tree requires it.
   
   Is there any reason we can't call `AddMergeExec` multiple times? Once (and always) as part of creating the physical plans and then potentially again as part of `Repartition`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] tustvold commented on pull request #378: Add support for multiple partitions with SortExec (#362)

Posted by GitBox <gi...@apache.org>.

tustvold commented on pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378#issuecomment-847976823


   I think this is "as correct as current master" and therefore marking this as ready for review. It is impacted by #423, however, so is current master, and so I think this is a separate issue that can be fixed independently of this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] Dandandan commented on pull request #378: Add support for multiple partitions with SortExec (#362)

Posted by GitBox <gi...@apache.org>.

Dandandan commented on pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378#issuecomment-847031288


   > I've added a new constructor that allows opting into the new behaviour. I wasn't aware of the way that MergeExec is plumbed into the plans and that this would break it.
   > 
   > I do wonder if instead of relying on an `AddMergeExec` optimisation pass, the plan conversion from `LogicalPlan::Sort` should just inspect the input partitioning and add the Merge if necessary. After all, it already has to inspect the partitioning for operators such as `LogicalPlan::Limit`, and so not just generating a valid plan from the outset seems a touch surprising to me...
   
   I think you are right about that, we should not rely on the optimizer to make the execution plan correct. I think it would be better if the planner adds the merge execs for the appropriate nodes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] tustvold commented on pull request #378: Add support for multiple partitions with SortExec (#362)

Posted by GitBox <gi...@apache.org>.

tustvold commented on pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378#issuecomment-847017349


   I've added a new constructor that allows opting into the new behaviour. I wasn't aware of the way that MergeExec is plumbed into the plans and that this would break it.
   
   I do wonder if instead of relying on an `AddMergeExec` optimisation pass, the plan conversion from `LogicalPlan::Sort` should just inspect the input partitioning and add the Merge if necessary. After all, it already has to inspect the partitioning for operators such as `LogicalPlan::Limit`, and so not just generating a valid plan from the outset seems a touch surprising to me...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] tustvold edited a comment on pull request #378: Add support for multiple partitions with SortExec (#362)

Posted by GitBox <gi...@apache.org>.

tustvold edited a comment on pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378#issuecomment-847158217


   I did some more digging into this and created #412 to track the fact that PhysicalPlanner currently creates plans that are incorrect.
   
   However, I think the issue is actually a bit more subtle than I first realised. Currently `Repartition` may insert `RepartitionExec` between an operator and its children, provided that operator doesn't require a single partition. It is then reliant on a later optimisation pass with `AddMergeExec` to join together the partitions if a operator further up the tree requires it.
   
   This means that the operators inserted by PhysicalPlan must somehow remember the partitioning they need to be correct, in order to prevent the optimiser from breaking them, simply adding MergeExec when generating the initial plan is insufficient.
   
   There are a couple of ways this gets handled that I can see:
   
   * Limit has two separate operators - `GlobalLimitExec` and `LocalLimitExec`
   * `HashAggregateExec` has an `AggregateMode` enumeration
   
   I therefore think the addition of a `preserve_partitioning` flag to `SortExec` is necessary and has precedent.
   
   However, it is unfortunately insufficient as nothing prevents `Repartition` from repartitioning a sorted partition (I think this might be an issue more generally). I need to think on this more, perhaps as @alamb mentioned on #379 there needs to be a concept of sorted-ness introduced for operators that optimisation passes such as `Repartition` and `AddMergeExec` would respect.
   
   Going to mark this as a draft for now, as the above will have implications for what the best way forward for this is


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] Dandandan commented on pull request #378: Add support for multiple partitions with SortExec (#362)

Posted by GitBox <gi...@apache.org>.

Dandandan commented on pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378#issuecomment-846048412


   Currently sort needs a single partition as otherwise the partitions are not sorted. A mergeexec currently is added based on this requirement.
   
   So this won't work I think untill we have the implementation to merge the sorted partitions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] Dandandan edited a comment on pull request #378: Add support for multiple partitions with SortExec (#362)

Posted by GitBox <gi...@apache.org>.

Dandandan edited a comment on pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378#issuecomment-847031288


   > I've added a new constructor that allows opting into the new behaviour. I wasn't aware of the way that MergeExec is plumbed into the plans and that this would break it.
   > 
   > I do wonder if instead of relying on an `AddMergeExec` optimisation pass, the plan conversion from `LogicalPlan::Sort` should just inspect the input partitioning and add the Merge if necessary. After all, it already has to inspect the partitioning for operators such as `LogicalPlan::Limit`, and so not just generating a valid plan from the outset seems a touch surprising to me...
   
   I think you are right about that, we should not rely on the optimizer to make the execution plan correct. I think it would be better if the planner adds the `MergeExec`s for the appropriate nodes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] tustvold edited a comment on pull request #378: Add support for multiple partitions with SortExec (#362)

Posted by GitBox <gi...@apache.org>.

tustvold edited a comment on pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378#issuecomment-847976823


   I think this is "as correct as current master" and therefore marking this as ready for review. It is impacted by #423 (the issue alluded to above r.e. the `Repartition` pass), however, so is current master, and so I think this is a separate issue that can be fixed independently of this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] tustvold commented on pull request #378: Add support for multiple partitions with SortExec (#362)

Posted by GitBox <gi...@apache.org>.

tustvold commented on pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378#issuecomment-847158217

I did some more digging into this and created #412 to track the fact that PhysicalPlanner currently creates plans that are incorrect.

However, I think the issue is actually a bit more subtle than I first realised. Currently `Repartition` may insert `RepartitionExec` between an operator and its children, provided that operator doesn't require a single partition. It is then reliant on a later optimisation pass with AddMergeExec to join together the partitions if a operator further up the tree requires it.

This means that the operators inserted by PhysicalPlan must somehow remember the partitioning they need to be correct, in order to prevent the optimiser from breaking them, simply adding MergeExec when generating the initial plan is insufficient.

There are a couple of ways this gets handled that I can see:

* Limit has two separate operators - `GlobalLimitExec` and `LocalLimitExec`
* `HashAggregateExec` has an `AggregateMode` enumeration

I therefore think the addition of a `preserve_partitioning` flag to `SortExec` is necessary and has precedent.

However, it is unfortunately insufficient as nothing prevents `Repartition` from repartitioning a sorted partition (I think this might be an issue more generally). I need to think on this more, perhaps as @alamb mentioned on #379 there needs to be a concept of sorted-ness introduced for operators that optimisation passes such as `Repartition` and `AddMergeExec` would respect.

Going to mark this as a draft for now, as the above will have implications for what the best way forward for this is

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow-datafusion] alamb merged pull request #378: Add support for multiple partitions with SortExec (#362)

Posted by GitBox <gi...@apache.org>.

alamb merged pull request #378:
URL: https://github.com/apache/arrow-datafusion/pull/378


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org