You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "mridulm (via GitHub)" <gi...@apache.org> on 2023/09/03 04:21:03 UTC

[GitHub] [spark] mridulm commented on a diff in pull request #42742: [SPARK-45025][CORE] Allow block manager memory store iterator to handle thread interrupt and perform task completion gracefully

mridulm commented on code in PR #42742:
URL: https://github.com/apache/spark/pull/42742#discussion_r1314096518


##########
core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala:
##########
@@ -220,7 +220,8 @@ private[spark] class MemoryStore(
     }
 
     // Unroll this block safely, checking whether we have exceeded our threshold periodically
-    while (values.hasNext && keepUnrolling) {
+    // and if no thread interrupts have been received.
+    while (values.hasNext && keepUnrolling && !Thread.currentThread().isInterrupted) {

Review Comment:
   @anishshri-db Thread interruption is a rare occurrence when compared to number of `putIterator` calls - so we should try to minimize the impact for the common case, as long as it is a reasonable additional cost when interruption does occur (having said that, please see more below).
   
   @JoshRosen It is way faster than I expected it to be [1] - while it is still slower than a local variable [2], the cost is really low when compared to the cost of other operations in the loop ... this should be just noise.
   
   Given this, while in general I prefer to minimize unnecessary cost, I am fine with the change.
   
   
   [1] Thanks for your comments, went digging into this - was fun !
   I knew it was intrinsic, but the uncontended reads are much faster than I had expected (I seem to be misremembering some stats) - and thanks for jdk14 ref, was not aware of that change.
   
   [2] 
   On my linux desktop: 2.175 +- 0.001 ns/op versus 2.424 +- 0.001 ns/op
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org