Posted to issues@spark.apache.org by "ZiyueGuan (Jira)" <ji...@apache.org> on 2021/05/31 03:05:00 UTC
[jira] [Created] (SPARK-35570) Shuffle file leak with external shuffle service enable
ZiyueGuan created SPARK-35570:
---------------------------------
Summary: Shuffle file leak with external shuffle service enable
Key: SPARK-35570
URL: https://issues.apache.org/jira/browse/SPARK-35570
Project: Spark
Issue Type: Bug
Components: Block Manager, Shuffle
Affects Versions: 3.1.2
Reporter: ZiyueGuan
Unlike RDD blocks, shuffle files are not cleaned up by the external shuffle service. Their cleanup relies mainly on live executors responding to requests from the context cleaner. Once an executor exits, the shuffle files it left behind are not cleaned up until the application exits. For streaming or other long-running applications, the disk may fill up.
I'm confused about why shuffle files are left behind as described above while the lifecycle of RDD blocks is handled properly. Is there a difference between the two?
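The leak described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Spark's actual code: the names `Executor`, `Cluster`, `write_shuffle`, `kill_executor`, and `clean_shuffle` are hypothetical, and the model assumes that a cleanup request reaches only executors that are still alive, while files written by an exited executor stay on disk because the external shuffle service keeps serving them.

```python
# Toy model only: all class and method names here are hypothetical and
# do not correspond to Spark's real classes or APIs.

class Executor:
    def __init__(self, executor_id):
        self.executor_id = executor_id
        self.alive = True


class Cluster:
    def __init__(self):
        self.executors = {}
        # Shuffle files on node-local disk, keyed by (executor_id, shuffle_id).
        self.disk = set()

    def add_executor(self, executor_id):
        self.executors[executor_id] = Executor(executor_id)

    def write_shuffle(self, executor_id, shuffle_id):
        self.disk.add((executor_id, shuffle_id))

    def kill_executor(self, executor_id):
        # Assumption: with an external shuffle service, the files written by
        # an exited executor remain on disk so they can still be served.
        self.executors[executor_id].alive = False

    def clean_shuffle(self, shuffle_id):
        # Assumption: the cleanup request is handled only by live executors,
        # so files owned by exited executors are never removed.
        for executor in self.executors.values():
            if executor.alive:
                self.disk.discard((executor.executor_id, shuffle_id))


def demo():
    cluster = Cluster()
    for executor_id in ("exec-1", "exec-2"):
        cluster.add_executor(executor_id)
        cluster.write_shuffle(executor_id, shuffle_id=0)

    cluster.kill_executor("exec-2")      # executor exits before cleanup
    cluster.clean_shuffle(shuffle_id=0)  # cleaner asks live executors only
    return sorted(cluster.disk)          # whatever is still on disk


if __name__ == "__main__":
    print(demo())
```

In this model the file written by `exec-2` is still on disk after cleanup, which mirrors the reported symptom: nothing removes it until the application as a whole ends.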