You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/13 06:00:18 UTC

[GitHub] [spark] cloud-fan commented on a change in pull request #24580: [SPARK-27674][SQL] the hint should not be dropped after cache lookup

cloud-fan commented on a change in pull request #24580: [SPARK-27674][SQL] the hint should not be dropped after cache lookup
URL: https://github.com/apache/spark/pull/24580#discussion_r283195657
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala
 ##########
 @@ -212,17 +213,18 @@ class CacheManager extends Logging {
   def useCachedData(plan: LogicalPlan): LogicalPlan = {
     val newPlan = plan transformDown {
       case command: IgnoreCachedData => command
-      // Do not lookup the cache by hint node. Hint node is special, we should ignore it when
-      // canonicalizing plans, so that plans which are same except hint can hit the same cache.
-      // However, we also want to keep the hint info after cache lookup. Here we skip the hint
-      // node, so that the returned caching plan won't replace the hint node and drop the hint info
-      // from the original plan.
-      case hint: ResolvedHint => hint
 
       case currentFragment =>
-        lookupCachedData(currentFragment)
-          .map(_.cachedRepresentation.withOutput(currentFragment.output))
-          .getOrElse(currentFragment)
+        lookupCachedData(currentFragment).map { cached =>
+          // After cache lookup, we should still keep the hints from the input plan.
 
 Review comment:
   It doesn't matter, because
   1. as a cache key, the lookup relies on `semanticEquals`, so having the hint node in the plan has no effect.
   2. the cache lookup returns `InMemoryRelation`, which has no hint.
   
   I think the behavior is pretty clear: for any query, the hint behavior should be the same no matter some sub-plans are cached or not.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org