You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/05/10 14:17:00 UTC

[GitHub] [spark] neilagupta commented on a diff in pull request #36496: [SPARK-39104][SQL] Synchronize isCachedColumnBuffersLoaded to avoid concurrency issue

neilagupta commented on code in PR #36496:
URL: https://github.com/apache/spark/pull/36496#discussion_r869294184


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala:
##########
@@ -238,7 +238,12 @@ case class CachedRDDBuilder(
   }
 
   def isCachedColumnBuffersLoaded: Boolean = {
-    _cachedColumnBuffers != null && isCachedRDDLoaded
+    if (_cachedColumnBuffers != null) {
+      _cachedColumnBuffers.synchronized {
+        return _cachedColumnBuffers != null && isCachedRDDLoaded
+      }
+    }
+    false
   }
 
   def isCachedRDDLoaded: Boolean = {

Review Comment:
   This change is really helpful, is it a good idea to add another synchronized block in `isCacheRDDLoaded` too? It seems `_cachedColumnBuffersAreLoaded` is also volatile and while there is no other call to `isCacheRDDLoaded`, adding that should be safe for another method to call it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org