Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/10 08:00:03 UTC

[GitHub] [spark] ulysses-you opened a new pull request #35473: [SPARK-38162][SQL] Optimize one row plan in normal and AQE Optimizer

ulysses-you opened a new pull request #35473:
URL: https://github.com/apache/spark/pull/35473

### What changes were proposed in this pull request?

- Add a new rule `OptimizeOneMaxRowPlan` to the normal Optimizer and the AQE Optimizer.
- Move the similar optimization from `EliminateSorts` into `OptimizeOneMaxRowPlan`, and update its comment and test accordingly.

### Why are the changes needed?

Optimize the plan when its max rows is less than or equal to 1, in these cases:

- if the max rows of a sort's child is less than or equal to 1, remove the sort
- if the max rows per partition of a local sort's child is less than or equal to 1, remove the local sort
- if the max rows of an aggregate's child is less than or equal to 1 and the aggregate is grouping-only (including the rewritten distinct plan), remove the aggregate
- if the max rows of an aggregate's child is less than or equal to 1, set distinct to false in all aggregate expressions
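The four cases above can be sketched on a toy plan model. This is only an illustration and not the code in this PR: `Plan`, `Scan`, `GlobalLimit`, `LocalLimit`, `Sort`, `Project`, `AggExpr`, `Aggregate` and `optimize` are hypothetical stand-ins defined here, not Catalyst classes, and replacing a grouping-only aggregate with a projection of its grouping columns is an assumption of this sketch.

```scala
object OneRowSketch {
  // Hypothetical toy plan model, NOT Spark's Catalyst classes.
  sealed trait Plan {
    def maxRows: Option[Long]                        // upper bound on total rows, if known
    def maxRowsPerPartition: Option[Long] = maxRows  // upper bound per partition, if known
  }
  case class Scan(table: String) extends Plan {
    def maxRows: Option[Long] = None
  }
  case class GlobalLimit(n: Long, child: Plan) extends Plan {
    def maxRows: Option[Long] = Some(n)
  }
  case class LocalLimit(n: Long, child: Plan) extends Plan {
    def maxRows: Option[Long] = child.maxRows
    override def maxRowsPerPartition: Option[Long] = Some(n)
  }
  case class Sort(keys: Seq[String], global: Boolean, child: Plan) extends Plan {
    def maxRows: Option[Long] = child.maxRows
    override def maxRowsPerPartition: Option[Long] = child.maxRowsPerPartition
  }
  case class Project(columns: Seq[String], child: Plan) extends Plan {
    def maxRows: Option[Long] = child.maxRows
    override def maxRowsPerPartition: Option[Long] = child.maxRowsPerPartition
  }
  case class AggExpr(func: String, column: String, isDistinct: Boolean)
  case class Aggregate(grouping: Seq[String], aggs: Seq[AggExpr], child: Plan) extends Plan {
    // A global aggregate (no grouping keys) yields exactly one row.
    def maxRows: Option[Long] = if (grouping.isEmpty) Some(1L) else child.maxRows
  }

  private def atMostOne(bound: Option[Long]): Boolean = bound.exists(_ <= 1L)

  def optimize(plan: Plan): Plan = plan match {
    // Case 1: a global sort over at most one row is a no-op.
    case Sort(_, true, child) if atMostOne(child.maxRows) => optimize(child)
    // Case 2: a local sort over at most one row per partition is a no-op.
    case Sort(_, false, child) if atMostOne(child.maxRowsPerPartition) => optimize(child)
    // Case 3: a grouping-only aggregate over at most one row cannot deduplicate anything.
    case Aggregate(grouping, aggs, child)
        if aggs.isEmpty && grouping.nonEmpty && atMostOne(child.maxRows) =>
      Project(grouping, optimize(child))
    // Case 4: DISTINCT inside aggregate functions is redundant over at most one row.
    case Aggregate(grouping, aggs, child) if atMostOne(child.maxRows) =>
      Aggregate(grouping, aggs.map(_.copy(isDistinct = false)), optimize(child))
    // Otherwise keep the node and recurse into its child.
    case Sort(keys, global, child) => Sort(keys, global, optimize(child))
    case Aggregate(grouping, aggs, child) => Aggregate(grouping, aggs, optimize(child))
    case Project(columns, child) => Project(columns, optimize(child))
    case GlobalLimit(n, child) => GlobalLimit(n, optimize(child))
    case LocalLimit(n, child) => LocalLimit(n, optimize(child))
    case scan: Scan => scan
  }
}
```

In this model, `OneRowSketch.optimize(Sort(Seq("a"), true, GlobalLimit(1, Scan("t"))))` drops the sort, while a sort directly over `Scan("t")` is kept because the scan has no known row bound.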

### Does this PR introduce _any_ user-facing change?

No. Only the query plan changes.

### How was this patch tested?

- Add a new test suite `OptimizeOneMaxRowPlanSuite` for the normal optimizer
- Add a test in `AdaptiveQueryExecSuite` for the AQE optimizer

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org