
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/11/03 10:33:19 UTC

[GitHub] [spark] fred-db opened a new pull request, #38497: [SPARK-40999] Hint propagation to subqueries

fred-db opened a new pull request, #38497:
URL: https://github.com/apache/spark/pull/38497

### What changes were proposed in this pull request?

We add a hint field to the `SubqueryExpression` class, pull hints in subqueries into that field during `EliminateResolvedHint`, and propagate the hint to joins formed from the subquery in `RewritePredicateSubquery`.

### Why are the changes needed?

Currently, if a user specifies a query like the following, the hints in the subquery are not respected.

```
SELECT * FROM target t WHERE EXISTS
(SELECT /*+ BROADCAST */ * FROM source s WHERE s.key = t.key)
```
This happens because hints are removed from the plan and pulled into joins at the beginning of the optimization stage, whereas subqueries are only turned into joins later, during optimization. Since we remove any hint that is not below a join, we end up removing hints inside a subquery before that subquery has become a join.
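
The ordering problem, and the effect of carrying the hint on the subquery expression, can be illustrated with a toy model. This is only a sketch in Python, not Spark code: the node classes (`Relation`, `Hint`, `Exists`, `Filter`, `Join`) and the functions `eliminate_hints` and `rewrite_subquery` are hypothetical stand-ins for the Catalyst plan nodes and for the `EliminateResolvedHint` and `RewritePredicateSubquery` rules, and the `keep_in_subquery` flag exists only to contrast the old and new behavior.

```python
from dataclasses import dataclass
from typing import Optional

# Toy plan nodes. These are illustrative stand-ins, not Spark's Catalyst classes.
@dataclass(frozen=True)
class Relation:
    name: str

@dataclass(frozen=True)
class Hint:
    name: str
    child: object

@dataclass(frozen=True)
class Exists:
    plan: object
    hint: Optional[str] = None  # models the hint field added to the subquery expression

@dataclass(frozen=True)
class Filter:
    condition: Exists
    child: object

@dataclass(frozen=True)
class Join:
    left: object
    right: object
    right_hint: Optional[str] = None

def eliminate_hints(plan, keep_in_subquery):
    """Strip a Hint node at the top of the subquery plan.

    Old behavior (keep_in_subquery=False): the hint is simply dropped,
    because there is no join above it yet.
    New behavior (keep_in_subquery=True): the hint is moved into Exists.hint.
    """
    if isinstance(plan, Filter):
        inner, hint = plan.condition.plan, None
        if isinstance(inner, Hint):
            hint, inner = inner.name, inner.child
        return Filter(Exists(inner, hint if keep_in_subquery else None), plan.child)
    return plan

def rewrite_subquery(plan):
    """Turn Filter(EXISTS subquery) into a join, forwarding any carried hint."""
    if isinstance(plan, Filter):
        sub = plan.condition
        return Join(plan.child, sub.plan, right_hint=sub.hint)
    return plan

# SELECT * FROM target t WHERE EXISTS (SELECT /*+ BROADCAST */ * FROM source s ...)
query = Filter(Exists(Hint("BROADCAST", Relation("source"))), Relation("target"))

before = rewrite_subquery(eliminate_hints(query, keep_in_subquery=False))
after = rewrite_subquery(eliminate_hints(query, keep_in_subquery=True))

print(before.right_hint)  # None: the hint was lost before the join existed
print(after.right_hint)   # BROADCAST: the hint reached the join
```

In the model, both runs apply the same two steps in the same order; the only difference is whether the subquery expression has somewhere to hold the hint between them.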

### Does this PR introduce _any_ user-facing change?

Yes. Hints on subqueries will now work.

### How was this patch tested?

Unit tests check that hints are correctly propagated to joins formed from subqueries.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
