Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/12/17 05:18:21 UTC

[GitHub] [spark] maryannxue opened a new pull request #34929: [SPARK-37670][SQL] Support predicate pushdown and column pruning for de-duped CTEs

maryannxue opened a new pull request #34929:
URL: https://github.com/apache/spark/pull/34929

### What changes were proposed in this pull request?

This PR adds predicate pushdown and column pruning for CTEs that are not inlined, and fixes a few potential correctness issues:
1) Replace CTE refs that were previously not inlined with Repartition operations at the end of logical plan optimization, so that `WithCTE` is not carried over to the physical plan. This simplifies the physical planning logic and avoids a correctness issue where the logical link of a physical plan node can point to `WithCTE` and lead to unexpected behavior in AQE, e.g., class cast exceptions in DPP/DFP.
2) Pull CTE defs that are not inlined up from subqueries to the main query level, to avoid creating copies of the same CTE def during predicate pushdown and other transformations.
3) Make CTE IDs more deterministic by starting from 0 for each query.
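
To illustrate why pushdown and pruning need care when a CTE def is shared by several refs, here is a minimal toy sketch in plain Python. It is not Spark code: `CTERef`, `plan_shared_scan`, and `evaluate_cte` are hypothetical names, and combining the per-ref predicates with OR is an assumption made for illustration, not something this description states. The idea is that a shared def may only drop a row if no ref needs it, and may only drop a column if no ref reads it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


@dataclass
class CTERef:
    """One reference to a shared (not inlined) CTE definition (hypothetical model)."""
    # Columns this reference reads, including those used by its own predicate.
    needed_columns: Set[str]
    # Filter sitting directly on top of this reference, or None if it has no filter.
    predicate: Optional[Predicate]


def plan_shared_scan(refs: List[CTERef]) -> Tuple[Set[str], Optional[Predicate]]:
    """Return (columns, predicate) that are safe to apply inside the shared def."""
    # Column pruning: keep the union of the columns any reference reads.
    columns: Set[str] = set().union(*(r.needed_columns for r in refs))
    # If any reference is unfiltered it needs every row, so nothing can be pushed.
    if any(r.predicate is None for r in refs):
        return columns, None
    preds = [r.predicate for r in refs]
    # Assumption: keep a row if at least one reference's predicate accepts it.
    return columns, lambda row: any(p(row) for p in preds)


def evaluate_cte(rows: List[Row], columns: Set[str],
                 predicate: Optional[Predicate]) -> List[Row]:
    """Evaluate the shared def once: filter first, then project the kept columns."""
    kept = rows if predicate is None else [r for r in rows if predicate(r)]
    return [{c: r[c] for c in sorted(columns)} for r in kept]


# Toy data: the "note" column is never read by any reference.
rows = [{"id": i, "amount": i * 10, "note": 0} for i in range(6)]
refs = [
    CTERef({"id", "amount"}, lambda r: r["id"] < 2),
    CTERef({"id"}, lambda r: r["id"] > 4),
]
columns, pushed = plan_shared_scan(refs)
shared = evaluate_cte(rows, columns, pushed)
print(shared)
```

Each ref would still apply its own filter on top of `shared`; the pushed-down predicate only reduces the rows and columns that the single shared evaluation has to produce.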

### Why are the changes needed?

Improves the performance of de-duped CTEs through predicate pushdown and column pruning, and fixes correctness issues with de-duped CTEs.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added unit tests.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

For additional commands, e-mail: reviews-help@spark.apache.org