Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/02/12 19:32:56 UTC

[GitHub] bersprockets opened a new pull request #23768: [SPARK-26851][SQL] Fix double-checked locking in CachedRDDBuilder

URL: https://github.com/apache/spark/pull/23768
 
 
   ## What changes were proposed in this pull request?
   
   According to Brian Goetz et al. in *Java Concurrency in Practice*, the double-checked locking (DCL) pattern has worked since Java 5, but only if the resource is declared volatile:
   
   > Subsequent changes in the JMM (Java 5.0 and later) have enabled DCL to work if resource is made volatile, and the performance impact of this is small since volatile reads are usually only slightly more expensive than nonvolatile reads.
   
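   ``CachedRDDBuilder.cachedColumnBuffers`` and ``CachedRDDBuilder.clearCache`` both use DCL to manage the resource ``_cachedColumnBuffers``. The missing ingredient is that ``_cachedColumnBuffers`` is not volatile, so the unsynchronized first check is not guaranteed to see the latest value written under the lock.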
   
   Therefore, this PR makes ``_cachedColumnBuffers`` volatile.
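   
   For illustration only, here is a minimal Java sketch of the DCL shape described above with the volatile fix applied. It is not the Spark source: ``DclSketch``, ``Builder``, ``Buffers``, ``get``, ``clear``, and ``buildCount`` are hypothetical names, and the real ``CachedRDDBuilder`` is Scala code that caches an RDD rather than a toy object.

```java
import java.util.concurrent.atomic.AtomicInteger;

class DclSketch {
    // Hypothetical stand-in for the cached resource; not Spark's type.
    static final class Buffers {
        final int id;
        Buffers(int id) { this.id = id; }
    }

    static final class Builder {
        // Counts how many times the resource was actually built.
        private final AtomicInteger builds = new AtomicInteger();

        // volatile is the key ingredient: without it, the unlocked read in
        // get() has no happens-before relation with the write made under the lock.
        private volatile Buffers cached;

        Buffers get() {
            Buffers local = cached;              // first check, no lock
            if (local == null) {
                synchronized (this) {
                    local = cached;              // second check, under the lock
                    if (local == null) {
                        local = new Buffers(builds.incrementAndGet());
                        cached = local;          // volatile write publishes the object
                    }
                }
            }
            return local;
        }

        void clear() {
            if (cached != null) {                // first check, no lock
                synchronized (this) {
                    if (cached != null) {        // second check, under the lock
                        cached = null;
                    }
                }
            }
        }

        int buildCount() { return builds.get(); }
    }

    public static void main(String[] args) {
        Builder builder = new Builder();
        Buffers first = builder.get();
        Buffers again = builder.get();           // served from the cache
        builder.clear();
        Buffers rebuilt = builder.get();         // built a second time
        System.out.println((first == again) + " " + (first != rebuilt)
                + " builds=" + builder.buildCount());
    }
}
```

   In the sketch, reading the field once into ``local`` keeps the fast path to a single volatile read. That is a common idiom and an assumption of this example, not something this PR changes.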
   
   ## How was this patch tested?
   
   - Existing SQL unit tests
   - Existing pyspark-sql tests
   
