You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by szhem <gi...@git.apache.org> on 2018/11/19 11:19:15 UTC
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
GitHub user szhem opened a pull request:
https://github.com/apache/spark/pull/23083
[SPARK-26114][CORE] ExternalSorter Leak
## What changes were proposed in this pull request?
This pull request fixes [SPARK-26114](https://issues.apache.org/jira/browse/SPARK-26114) issue that occurs when trying to reduce the number of partitions by means of coalesce without shuffling after shuffle-based transformations.
For the following data
```scala
import org.apache.hadoop.io._
import org.apache.hadoop.io.compress._
import org.apache.commons.lang._
import org.apache.spark._
// generate 100M records of sample data
sc.makeRDD(1 to 1000, 1000)
.flatMap(item => (1 to 100000)
.map(i => new Text(RandomStringUtils.randomAlphanumeric(3).toLowerCase) -> new Text(RandomStringUtils.randomAlphanumeric(1024))))
.saveAsSequenceFile("/tmp/random-strings", Some(classOf[GzipCodec]))
```
and the following job
```scala
import org.apache.hadoop.io._
import org.apache.spark._
import org.apache.spark.storage._
val rdd = sc.sequenceFile("/tmp/random-strings", classOf[Text], classOf[Text])
rdd
.map(item => item._1.toString -> item._2.toString)
.repartitionAndSortWithinPartitions(new HashPartitioner(1000))
.coalesce(10,false)
.count
```
... executed like the following
```bash
spark-shell \
--num-executors=5 \
--executor-cores=2 \
--master=yarn \
--deploy-mode=client \
--conf spark.executor.memory=1g \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.executor.extraJavaOptions='-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -Dio.netty.noUnsafe=true'
```
... executors are always failing with OutOfMemoryErrors.
The main issue is multiple leaks of ExternalSorter references.
For example, in case of 2 tasks per executor it is expected to be 2 simultaneous instances of ExternalSorter per executor but heap dump generated on OutOfMemoryError shows that there are more ones.
![run1-noparams-dominator-tree-externalsorter](https://user-images.githubusercontent.com/1523889/48703665-782ce580-ec05-11e8-95a9-d6c94e8285ab.png)
P.S. This PR does not cover cases with CoGroupedRDDs which use ExternalAppendOnlyMap internally, which itself can lead to OutOfMemoryErrors in many places.
## How was this patch tested?
- Existing unit tests
- New unit tests
- Job executions on the live environment
Here is the screenshot before applying this patch
![run3-noparams-failure-ui-5x2-repartition-and-sort](https://user-images.githubusercontent.com/1523889/48700395-f769eb80-ebfc-11e8-831b-e94c757d416c.png)
Here is the screenshot after applying this patch
![run3-noparams-success-ui-5x2-repartition-and-sort](https://user-images.githubusercontent.com/1523889/48700610-7a8b4180-ebfd-11e8-9761-baaf38a58e66.png)
And in case of reducing the number of executors even more the job is still stable
![run3-noparams-success-ui-2x2-repartition-and-sort](https://user-images.githubusercontent.com/1523889/48700619-82e37c80-ebfd-11e8-98ed-a38e1f1f1fd9.png)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/szhem/spark SPARK-26114-externalsorter-leak
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23083.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23083
----
commit 9ca5ddbeb2022259ec79e60b3e81332466947d90
Author: Sergey Zhemzhitsky <sz...@...>
Date: 2018-11-05T22:31:59Z
Allow memory consumers and spillables to be optionally unregistered and not trackable on freeing up the memory
commit bf57de2c9d23c3cbac23817a3d0b6158739ae8aa
Author: Sergey Zhemzhitsky <sz...@...>
Date: 2018-11-05T22:34:15Z
allow to remove already registered task completion listeners if they dont need to be fired at task completion
commit e3531acf2751d4ba63e7fa353b66c70d534271df
Author: Sergey Zhemzhitsky <sz...@...>
Date: 2018-11-05T22:35:31Z
clean up readingIterator on stop
commit baa9656e9d9fa2d006ed6fb98257a213bcace588
Author: Sergey Zhemzhitsky <sz...@...>
Date: 2018-11-05T22:37:07Z
prevent capturing and holding sub-iterators after completion
commit 3cc54522545f43e0ee9ff9443ba3ea67e5fb9d5b
Author: Sergey Zhemzhitsky <sz...@...>
Date: 2018-11-05T22:40:13Z
cleaning up resourses as soon as they no longer needed, dont waiting till the end of the task
commit d36323e7db3d8dc5ec01a2ae46752342ff01fac5
Author: Sergey Zhemzhitsky <sz...@...>
Date: 2018-11-18T01:01:15Z
adding some unit tests
commit 12075ec265f0d09cd52865bb91155898b9ede523
Author: Sergey Zhemzhitsky <sz...@...>
Date: 2018-11-18T07:07:27Z
improved docs
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by szhem <gi...@git.apache.org>.
Github user szhem commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235757779
--- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala ---
@@ -99,6 +99,13 @@ private[spark] class TaskContextImpl(
this
}
+ override def remoteTaskCompletionListener(listener: TaskCompletionListener)
+ : this.type = synchronized {
+ onCompleteCallbacks -= listener
--- End diff --
Replaced ArrayBuffer with LinkedHashSet. Thank you!
Another interesting question is why these collections are traversed in the reverse order in [invokeLisneters](https://github.com/apache/spark/pull/23083/files#diff-df60223d4208aff4c67335528a55154cR135) like the following
```scala
private def invokeListeners[T](
listeners: Seq[T],
name: String,
error: Option[Throwable])(
callback: T => Unit): Unit = {
val errorMsgs = new ArrayBuffer[String](2)
// Process callbacks in the reverse order of registration
listeners.reverse.foreach { listener =>
...
}
}
```
I believe @hvanhovell could help to understand. @hvanhovell Could you please remind why task completion and error listeners are traversed in the reverse order (you seem to the the one who added the corresponding line)?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by szhem <gi...@git.apache.org>.
Github user szhem commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r236057101
--- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -727,9 +727,10 @@ private[spark] class ExternalSorter[K, V, C](
spills.clear()
forceSpillFiles.foreach(s => s.file.delete())
forceSpillFiles.clear()
- if (map != null || buffer != null) {
+ if (map != null || buffer != null || readingIterator != null) {
map = null // So that the memory can be garbage-collected
buffer = null // So that the memory can be garbage-collected
+ readingIterator = null // So that the memory can be garbage-collected
--- End diff --
@advancedxy I've tried to remove all the modifications except for this one and got OutOfMemoryErrors once again. Here are the details:
1. Now there are 4 `ExternalSorter` remained
2 of them are not closed ones ...
![1_readingiterator_isnull_nonclosed_externalsorter](https://user-images.githubusercontent.com/1523889/48973288-2218d180-f04d-11e8-9329-27b3edf33c48.png)
and 2 of them are closed ones ...
![2_readingiterator_isnull_closed_externalsorter](https://user-images.githubusercontent.com/1523889/48973295-483e7180-f04d-11e8-83cf-23361515363f.png)
as expected
2. There are 2 `SpillableIterator`s (which consume a significant part of memory) of already closed `ExternalSorter`s remained
![4_readingiterator_isnull_spillableiterator_of_closed_externalsorter](https://user-images.githubusercontent.com/1523889/48973318-cf8be500-f04d-11e8-912f-74be7420ca95.png)
3. These `SpillableIterator`s are referenced by `CompletionIterator`s ...
![6_completioniterator_of_blockstoreshufflereader](https://user-images.githubusercontent.com/1523889/48973357-a6b81f80-f04e-11e8-810f-dc8941430f34.png)
... which in their order seem to be referenced by the `cur` field ...
![7_coalescedrdd_compute_flatmap](https://user-images.githubusercontent.com/1523889/48973491-7e7df000-f051-11e8-8864-7e9e7f3f994b.png)
... of the standard `Iterator`'s `flatMap` that is used in the `compute` method of `CoalescedRDD`
![image](https://user-images.githubusercontent.com/1523889/48973401-7fae1d80-f04f-11e8-8cf2-043c808173d9.png)
Standard `Iterator`'s `flatMap` does not clean up its `cur` field before obtaining the next value for it which in its order will consume quite a lot of memory too
![image](https://user-images.githubusercontent.com/1523889/48973418-dfa4c400-f04f-11e8-8f0e-b464567d43de.png)
.. and in case of Spark that means that the previous iterator consuming the memory will live there while fetching the next value for it
![8_coalescedrdd_compute_flatmap_cur_isnotassigned](https://user-images.githubusercontent.com/1523889/48974089-0dddd000-f05f-11e8-8319-f7d1f778f381.png)
So I've returned the changes made to the `CompletionIterator` to reassign the reference of its sub-iterator to the `empty` iterator ...
![image](https://user-images.githubusercontent.com/1523889/48973472-27781b00-f051-11e8-86e1-cd6b87cd114b.png)
... and that has helped.
P.S. I believe that cleaning up the standard `flatMap`'s iterator `cur` field before calling `nextCur` could help too
```scala
def flatMap[B](f: A => GenTraversableOnce[B]): Iterator[B] = new AbstractIterator[B] {
private var cur: Iterator[B] = empty
private def nextCur() { cur = f(self.next()).toIterator }
def hasNext: Boolean = {
// Equivalent to cur.hasNext || self.hasNext && { nextCur(); hasNext }
// but slightly shorter bytecode (better JVM inlining!)
while (!cur.hasNext) {
cur = empty
if (!self.hasNext) return false
nextCur()
}
true
}
def next(): B = (if (hasNext) cur else empty).next()
}
```
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by advancedxy <gi...@git.apache.org>.
Github user advancedxy commented on the issue:
https://github.com/apache/spark/pull/23083
> For the task completion listener, I think it's an overkill to introduce a new API, do you know where exactly we leak the memory? and can we null it out when the ShuffleBlockFetcherIterator reaches to its end?
If I understand correctly, the memory is leaked because external sorter is referenced in `TaskCompletionListener` and it's only gced when the task is completed. However for `coalesce` or similar APIs, multiple `BlockStoreShuffleReader`s are created as there are multiple input sources, the internal sorter is not released until all shuffle readers are consumed and task is finished.
It's an overkill to introduce a new API. However, I think we can limited it into private[Spark] scope.
Like @szhem, I don't figure out another way to null out the sorter reference yet.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by advancedxy <gi...@git.apache.org>.
Github user advancedxy commented on the issue:
https://github.com/apache/spark/pull/23083
And another thing:
> P.S. This PR does not cover cases with CoGroupedRDDs which use ExternalAppendOnlyMap internally, which itself can lead to OutOfMemoryErrors in many places.
So do you mean CoGroupRDDs with multiple input sources with have similar problems? If so, can you create another Jira?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23083
Looking at the code, we are trying to fix 2 memory leaks: the task completion listener in `ShuffleBlockFetcherIterator`, and the `CompletionIterator`. If that's case, can you say that in the PR description?
For the task completion listener, I think it's an overkill to introduce a new API, do we know exactly where we leak the memory? and can we null it out when the `ShuffleBlockFetcherIterator` reaches to its end?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99309/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5437/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #4433 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4433/testReport)** for PR 23083 at commit [`12075ec`](https://github.com/apache/spark/commit/12075ec265f0d09cd52865bb91155898b9ede523).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99309 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99309/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter's readingItera...
Posted by advancedxy <gi...@git.apache.org>.
Github user advancedxy commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r236060023
--- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -727,9 +727,10 @@ private[spark] class ExternalSorter[K, V, C](
spills.clear()
forceSpillFiles.foreach(s => s.file.delete())
forceSpillFiles.clear()
- if (map != null || buffer != null) {
+ if (map != null || buffer != null || readingIterator != null) {
map = null // So that the memory can be garbage-collected
buffer = null // So that the memory can be garbage-collected
+ readingIterator = null // So that the memory can be garbage-collected
--- End diff --
Nice. Case well explained.
But I think you need to add corresponding test cases for `CompletionIterator` and `ExternalSorter`.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by szhem <gi...@git.apache.org>.
Github user szhem commented on the issue:
https://github.com/apache/spark/pull/23083
Hi @davies, @advancedxy, @rxin,
You seem to be the last ones who touched the corresponding parts of the files in this PR.
Could you be so kind to take a look at it?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by advancedxy <gi...@git.apache.org>.
Github user advancedxy commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235357328
--- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala ---
@@ -99,6 +99,13 @@ private[spark] class TaskContextImpl(
this
}
+ override def remoteTaskCompletionListener(listener: TaskCompletionListener)
+ : this.type = synchronized {
+ onCompleteCallbacks -= listener
--- End diff --
If we are going to add the new interface, I think so.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5406/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99313/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by advancedxy <gi...@git.apache.org>.
Github user advancedxy commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235297396
--- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala ---
@@ -127,9 +127,21 @@ abstract class TaskContext extends Serializable {
// Note that due to this scala bug: https://github.com/scala/bug/issues/11016, we need to make
// this function polymorphic for every scala version >= 2.12, otherwise an overloaded method
// resolution error occurs at compile time.
- addTaskCompletionListener(new TaskCompletionListener {
- override def onTaskCompletion(context: TaskContext): Unit = f(context)
- })
+ addTaskCompletionListener(TaskCompletionListenerWrapper(f))
+ }
+
+ /**
+ * Removes a (Java friendly) listener that is no longer needed to be executed on task completion.
+ */
+ def remoteTaskCompletionListener(listener: TaskCompletionListener): TaskContext
--- End diff --
you mean `removeTaskCompletionListener`? didn't you?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99351 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99351/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99360/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99326 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99326/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by szhem <gi...@git.apache.org>.
Github user szhem commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235314949
--- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala ---
@@ -99,6 +99,13 @@ private[spark] class TaskContextImpl(
this
}
+ override def remoteTaskCompletionListener(listener: TaskCompletionListener)
+ : this.type = synchronized {
+ onCompleteCallbacks -= listener
--- End diff --
Should we do the same thing (i.e. chaning ArrayBuffer to LinkedHashSet) for onFailureCallbacks too?
```scala
/** List of callback functions to execute when the task completes. */
@transient private val onCompleteCallbacks = new ArrayBuffer[TaskCompletionListener]
/** List of callback functions to execute when the task fails. */
@transient private val onFailureCallbacks = new ArrayBuffer[TaskFailureListener]
```
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #4433 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4433/testReport)** for PR 23083 at commit [`12075ec`](https://github.com/apache/spark/commit/12075ec265f0d09cd52865bb91155898b9ede523).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23083
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by szhem <gi...@git.apache.org>.
Github user szhem commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235308451
--- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala ---
@@ -127,9 +127,21 @@ abstract class TaskContext extends Serializable {
// Note that due to this scala bug: https://github.com/scala/bug/issues/11016, we need to make
// this function polymorphic for every scala version >= 2.12, otherwise an overloaded method
// resolution error occurs at compile time.
- addTaskCompletionListener(new TaskCompletionListener {
- override def onTaskCompletion(context: TaskContext): Unit = f(context)
- })
+ addTaskCompletionListener(TaskCompletionListenerWrapper(f))
+ }
+
+ /**
+ * Removes a (Java friendly) listener that is no longer needed to be executed on task completion.
+ */
+ def remoteTaskCompletionListener(listener: TaskCompletionListener): TaskContext
--- End diff --
Yep, seems that `v` was replaced with `t` on my keyboard)
Thanks a lot!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by szhem <gi...@git.apache.org>.
Github user szhem commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235769204
--- Diff: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala ---
@@ -103,11 +116,26 @@ private[spark] class BlockStoreShuffleReader[K, C](
context.taskMetrics().incMemoryBytesSpilled(sorter.memoryBytesSpilled)
context.taskMetrics().incDiskBytesSpilled(sorter.diskBytesSpilled)
context.taskMetrics().incPeakExecutionMemory(sorter.peakMemoryUsedBytes)
+ val taskListener = new TaskCompletionListener {
+ override def onTaskCompletion(context: TaskContext): Unit = sorter.stop()
+ }
// Use completion callback to stop sorter if task was finished/cancelled.
- context.addTaskCompletionListener[Unit](_ => {
- sorter.stop()
- })
- CompletionIterator[Product2[K, C], Iterator[Product2[K, C]]](sorter.iterator, sorter.stop())
+ context.addTaskCompletionListener(taskListener)
+ CompletionIterator[Product2[K, C], Iterator[Product2[K, C]]](
+ sorter.iterator,
+ {
+ sorter.stop()
+ // remove task completion listener as soon as the sorter stops to prevent holding
+ // its references till the end of the task which may lead to memory leaks, for
+ // example, in case of processing multiple ShuffledRDDPartitions by a single task
+ // like in case of CoalescedRDD occurred after the ShuffledRDD in the same stage
+ // (e.g. rdd.repartition(1000).coalesce(10));
+ // note that holding sorter references till the end of the task also holds
+ // references to PartitionedAppendOnlyMap and PartitionedPairBuffer too and these
+ // ones may consume a significant part of the available memory
+ context.remoteTaskCompletionListener(taskListener)
--- End diff --
Great question! Honestly speaking I don't have pretty good solution right now.
TaskCompletionListener stops sorter in case of task failures, cancels, etc., i.e. in case of abnormal termination. In "happy path" case task completion listener is not needed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99254 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99254/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99351/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99267/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99326 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99326/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99309 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99309/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99254/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by advancedxy <gi...@git.apache.org>.
Github user advancedxy commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235300436
--- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ---
@@ -72,7 +73,8 @@ final class ShuffleBlockFetcherIterator(
maxBlocksInFlightPerAddress: Int,
maxReqSizeShuffleToMem: Long,
detectCorrupt: Boolean)
- extends Iterator[(BlockId, InputStream)] with DownloadFileManager with Logging {
+ extends Iterator[(BlockId, InputStream)] with DownloadFileManager with TaskCompletionListener
--- End diff --
I don't it's a good idea for ShuffleBlockFetchIterator to be a subclass of TaskCompletionListener.
What's wrong with original solution?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by advancedxy <gi...@git.apache.org>.
Github user advancedxy commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235299353
--- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala ---
@@ -99,6 +99,13 @@ private[spark] class TaskContextImpl(
this
}
+ override def remoteTaskCompletionListener(listener: TaskCompletionListener)
+ : this.type = synchronized {
+ onCompleteCallbacks -= listener
--- End diff --
I'm not use whether we should add `removeTaskCompletionListener` or not.
If we are going to add this method. Then this's an O(n) operation. Maybe we need to replace onCompletedCallbacks to a LinkedHashSet?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by szhem <gi...@git.apache.org>.
Github user szhem commented on the issue:
https://github.com/apache/spark/pull/23083
> So do you mean CoGroupRDDs with multiple input sources will have similar problems?
Yep, but a little bit different ones
> If so, can you create another Jira?
Will do it shortly.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23083
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23083
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99267 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99267/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by advancedxy <gi...@git.apache.org>.
Github user advancedxy commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235299656
--- Diff: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala ---
@@ -103,11 +116,26 @@ private[spark] class BlockStoreShuffleReader[K, C](
context.taskMetrics().incMemoryBytesSpilled(sorter.memoryBytesSpilled)
context.taskMetrics().incDiskBytesSpilled(sorter.diskBytesSpilled)
context.taskMetrics().incPeakExecutionMemory(sorter.peakMemoryUsedBytes)
+ val taskListener = new TaskCompletionListener {
+ override def onTaskCompletion(context: TaskContext): Unit = sorter.stop()
+ }
// Use completion callback to stop sorter if task was finished/cancelled.
- context.addTaskCompletionListener[Unit](_ => {
- sorter.stop()
- })
- CompletionIterator[Product2[K, C], Iterator[Product2[K, C]]](sorter.iterator, sorter.stop())
+ context.addTaskCompletionListener(taskListener)
+ CompletionIterator[Product2[K, C], Iterator[Product2[K, C]]](
+ sorter.iterator,
+ {
+ sorter.stop()
+ // remove task completion listener as soon as the sorter stops to prevent holding
+ // its references till the end of the task which may lead to memory leaks, for
+ // example, in case of processing multiple ShuffledRDDPartitions by a single task
+ // like in case of CoalescedRDD occurred after the ShuffledRDD in the same stage
+ // (e.g. rdd.repartition(1000).coalesce(10));
+ // note that holding sorter references till the end of the task also holds
+ // references to PartitionedAppendOnlyMap and PartitionedPairBuffer too and these
+ // ones may consume a significant part of the available memory
+ context.remoteTaskCompletionListener(taskListener)
--- End diff --
Nice catch.
Liked I said in the above, do we have another way to remove reference to sorter?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by advancedxy <gi...@git.apache.org>.
Github user advancedxy commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235955118
--- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -727,9 +727,10 @@ private[spark] class ExternalSorter[K, V, C](
spills.clear()
forceSpillFiles.foreach(s => s.file.delete())
forceSpillFiles.clear()
- if (map != null || buffer != null) {
+ if (map != null || buffer != null || readingIterator != null) {
map = null // So that the memory can be garbage-collected
buffer = null // So that the memory can be garbage-collected
+ readingIterator = null // So that the memory can be garbage-collected
--- End diff --
Hi @szhem , I discussed with wenchen offline. I think this is the key point. After nulling out `readingIterator`, `ExternalSorter` should released all the memories it occupied.
Yes, `ExternalSorter` is leaked in `TaskCompletionListener`, but it would already be stopped in `CompletionIterator` in happy path. The stopped sorter wouldn't occupy too much memory. The `readingIterator`
is occupying memory because it may reference `map/buffer.partitionedDestructiveSortedIterator`, which itself references `map/buffer`. So only nulling out map or buffer is not enough.
Can you try with this modification only and see whether OOM still occurs.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by szhem <gi...@git.apache.org>.
Github user szhem commented on the issue:
https://github.com/apache/spark/pull/23083
Hi @cloud-fan
> Looking at the code, we are trying to fix 2 memory leaks: the task completion listener in ShuffleBlockFetcherIterator, and the CompletionIterator. If that's case, can you say that in the PR description?
I've updated the description and the title of this PR correspondingly.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23083
thanks, merging to master/2.4!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter's readingItera...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/23083
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by szhem <gi...@git.apache.org>.
Github user szhem commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235319165
--- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ---
@@ -72,7 +73,8 @@ final class ShuffleBlockFetcherIterator(
maxBlocksInFlightPerAddress: Int,
maxReqSizeShuffleToMem: Long,
detectCorrupt: Boolean)
- extends Iterator[(BlockId, InputStream)] with DownloadFileManager with Logging {
+ extends Iterator[(BlockId, InputStream)] with DownloadFileManager with TaskCompletionListener
--- End diff --
The main reason is that TaskCompletionListener is added in one place (in `initialize` method) and needs to be removed in another one (in `cleanup` method).
![image](https://user-images.githubusercontent.com/1523889/48833554-abe64780-ed8c-11e8-9da0-ef826918a275.png)
Will introduce a field for TaskCompletionListener instead. Thank you!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter's readingItera...
Posted by szhem <gi...@git.apache.org>.
Github user szhem commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r236083618
--- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -727,9 +727,10 @@ private[spark] class ExternalSorter[K, V, C](
spills.clear()
forceSpillFiles.foreach(s => s.file.delete())
forceSpillFiles.clear()
- if (map != null || buffer != null) {
+ if (map != null || buffer != null || readingIterator != null) {
map = null // So that the memory can be garbage-collected
buffer = null // So that the memory can be garbage-collected
+ readingIterator = null // So that the memory can be garbage-collected
--- End diff --
I've added [test case for CompletionIterator](https://github.com/apache/spark/pull/23083/files#diff-444ed6b5e5333c3359cecec7d082396dR50).
Regarding `ExternalSorter` - taking into account that only the private api has been changed and there no similar test cases which verify that private `map` and `buffer` fields are set to `null` after sorter stops, don't you think that already existing tests cover the situation with `readingIterator` too?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by szhem <gi...@git.apache.org>.
Github user szhem commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235759169
--- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ---
@@ -72,7 +73,8 @@ final class ShuffleBlockFetcherIterator(
maxBlocksInFlightPerAddress: Int,
maxReqSizeShuffleToMem: Long,
detectCorrupt: Boolean)
- extends Iterator[(BlockId, InputStream)] with DownloadFileManager with Logging {
+ extends Iterator[(BlockId, InputStream)] with DownloadFileManager with TaskCompletionListener
--- End diff --
Introduced [the corresponding field](https://github.com/apache/spark/pull/23083/files#diff-27109eb30a77542d377c936e0d134420R156).
```scala
/**
* Task completion callback to be called in both success as well as failure cases to cleanup.
* It may not be called at all in case the `cleanup` method has already been called before
* task completion.
*/
private[this] val cleanupTaskCompletionListener = (_: TaskContext) => cleanup()
```
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by advancedxy <gi...@git.apache.org>.
Github user advancedxy commented on a diff in the pull request:
https://github.com/apache/spark/pull/23083#discussion_r235357628
--- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ---
@@ -72,7 +73,8 @@ final class ShuffleBlockFetcherIterator(
maxBlocksInFlightPerAddress: Int,
maxReqSizeShuffleToMem: Long,
detectCorrupt: Boolean)
- extends Iterator[(BlockId, InputStream)] with DownloadFileManager with Logging {
+ extends Iterator[(BlockId, InputStream)] with DownloadFileManager with TaskCompletionListener
--- End diff --
Another field sounds reasonable.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23083
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5341/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99313 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99313/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99254 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99254/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99360/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5396/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23083
LGTM, thanks for your great work!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99326/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99267 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99267/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5428/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23083
ok to test
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99351/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23083
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23083
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5392/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5351/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99313/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23083
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23083
**[Test build #99360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99360/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org