Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/17 15:00:08 UTC

[GitHub] [spark] tomvanbussel commented on a change in pull request #29787: [SPARK-32911][CORE] Free memory in UnsafeExternalSorter.SpillableIterator.spill() when all records have been read.

tomvanbussel commented on a change in pull request #29787:
URL: https://github.com/apache/spark/pull/29787#discussion_r490322471



##########
File path: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java
##########
@@ -581,14 +590,15 @@ public void loadNext() throws IOException {
       try {
         synchronized (this) {
           loaded = true;
-          // Just consumed the last record from in memory iterator
+          // Just consumed the last record from the in-memory iterator.
           if (lastPage != null) {
             // Do not free the page here, while we are locking `SpillableIterator`. The `freePage`
             // method locks the `TaskMemoryManager`, and it's a bad idea to lock 2 objects in
             // sequence. We may hit dead lock if another thread locks `TaskMemoryManager` and
             // `SpillableIterator` in sequence, which may happen in
             // `TaskMemoryManager.acquireExecutionMemory`.
             pageToFree = lastPage;
+            allocatedPages.clear();

Review comment:
       `spill()` synchronizes on `UnsafeExternalSorter.this` when accessing `allocatedPages`, whereas this `clear()` runs while holding only the `SpillableIterator` lock. I presume the synchronization in `spill()` is there to avoid concurrency issues with `cleanupResources`. However, I think we're pretty much screwed here anyway if there's a concurrent call to `cleanupResources`, as that will free the page of the record that we are loading here.
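
To make the locking concern concrete, here is a minimal sketch, not the Spark code: `SorterSketch`, `IteratorSketch`, and every method on them are hypothetical stand-ins. It assumes one possible way to keep `allocatedPages` guarded by the outer object's monitor without holding the iterator lock and the outer lock at the same time. It does not address the caveat above: a concurrent cleanup could still free the page that holds the record being loaded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the sorter; this is NOT UnsafeExternalSorter.
class SorterSketch {
    // Guarded by the SorterSketch monitor (assumption: mirrors how spill() guards the list).
    private final List<long[]> allocatedPages = new ArrayList<>();
    // Counts pages handed back, so the behaviour can be observed in a test.
    private final AtomicInteger freedPages = new AtomicInteger();

    synchronized void allocatePage(int words) {
        allocatedPages.add(new long[words]);
    }

    synchronized int allocatedPageCount() {
        return allocatedPages.size();
    }

    int freedPageCount() {
        return freedPages.get();
    }

    // Hypothetical analogue of cleanupResources(): frees whatever is still tracked.
    synchronized void cleanupResources() {
        freedPages.addAndGet(allocatedPages.size());
        allocatedPages.clear();
    }

    // Hypothetical stand-in for the spillable iterator (an inner class, so it can
    // reach the outer monitor via SorterSketch.this).
    class IteratorSketch {
        private boolean loaded = false;

        synchronized boolean isLoaded() {
            return loaded;
        }

        void loadNext() {
            // Step 1: update iterator state under the iterator lock only.
            synchronized (this) {
                loaded = true;
            }
            // Step 2: the iterator lock is released; now take the outer lock
            // to detach the tracked pages. The two locks are never nested.
            List<long[]> toFree;
            synchronized (SorterSketch.this) {
                toFree = new ArrayList<>(allocatedPages);
                allocatedPages.clear();
            }
            // Step 3: "free" the detached pages while holding neither lock.
            freedPages.addAndGet(toFree.size());
        }
    }
}
```

Because the pages are detached from `allocatedPages` under the outer monitor, a later cleanup in this sketch finds nothing left to free and cannot free the same page twice. Whether the extra lock acquisition is worthwhile in the real class is a separate question.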



