Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/30 01:47:03 UTC

[GitHub] [spark] viirya commented on pull request #29804: [SPARK-32859][SQL] Introduce physical rule to decide bucketing dynamically

viirya commented on pull request #29804:
URL: https://github.com/apache/spark/pull/29804#issuecomment-701110070


   I think it is easy to run into such a case. In case 1, the sub-plan from the root down to the bucketed table scan contains no operator with [[hasInterestingPartition]]. Suppose we cache a query plan like that, and another query then uses the cached plan with a [[hasInterestingPartition]] operator on top of it. We then won't do a bucketed scan, even though a bucketed scan could benefit the later query. It seems to me that this feature can easily cause unintentional regressions like that.
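
To make the concern concrete, here is a minimal toy model of the scenario, not Spark code. The `Node` class, the `decide_bucketing` function, the `planned` flag, and the operator names are all hypothetical stand-ins; in particular, the assumption that the rule never revisits an already-cached plan is only how this sketch models caching, not a statement about the actual implementation in the PR.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy physical-plan node (hypothetical; not a Spark class)."""
    name: str
    children: List["Node"] = field(default_factory=list)
    interesting_partition: bool = False   # stand-in for hasInterestingPartition
    bucketed_scan: Optional[bool] = None  # set by the rule on scan nodes only
    planned: bool = False                 # True once cached; the rule skips it


def decide_bucketing(node: Node, seen_interesting: bool = False) -> Node:
    """Enable a bucketed scan only if an 'interesting' operator sits above it."""
    if node.planned:
        # Assumption of this sketch: a cached plan is not planned again.
        return node
    seen = seen_interesting or node.interesting_partition
    if node.name == "BucketedTableScan":
        node.bucketed_scan = seen
    for child in node.children:
        decide_bucketing(child, seen)
    return node


# Query 1 (case 1): no interesting operator above the scan, then cached.
cached_scan = Node("BucketedTableScan")
cached_plan = decide_bucketing(Node("Project", [Node("Filter", [cached_scan])]))
cached_plan.planned = True

# Query 2: a join (interesting partitioning) on top of the cached plan.
decide_bucketing(Node("SortMergeJoin", [cached_plan], interesting_partition=True))

# Same shape as query 2, but planned from scratch without the cache.
fresh_scan = Node("BucketedTableScan")
decide_bucketing(
    Node("SortMergeJoin",
         [Node("Project", [Node("Filter", [fresh_scan])])],
         interesting_partition=True)
)
```

In this sketch `cached_scan.bucketed_scan` stays `False` after query 2, while `fresh_scan.bucketed_scan` is `True`: the decision made for the first query is carried into the second one, which is the kind of regression described above.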

