Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2017/09/07 08:47:00 UTC

[jira] [Commented] (SPARK-21942) Fix DiskBlockManager crashing when a root local folder has been externally deleted by OS

    [ https://issues.apache.org/jira/browse/SPARK-21942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156702#comment-16156702 ] 

Sean Owen commented on SPARK-21942:
-----------------------------------

You'll want to configure a different scratch directory, or stop your system from deleting files in /tmp on a regular basis.
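
For example, a minimal sketch of pointing spark.local.dir at a directory outside /tmp (the path /var/spark-scratch below is only a placeholder; pick one that suits your deployment):

{code}
// A sketch, assuming a path such as /var/spark-scratch exists and is
// writable; any directory not covered by the OS's /tmp cleanup works.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("example")
  .set("spark.local.dir", "/var/spark-scratch")
val sc = new SparkContext(conf)
{code}

The same setting can also go in conf/spark-defaults.conf (under cluster managers like YARN, the cluster's own local directories take precedence over this option).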

> Fix DiskBlockManager crashing when a root local folder has been externally deleted by OS
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-21942
>                 URL: https://issues.apache.org/jira/browse/SPARK-21942
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1, 2.2.0, 2.2.1, 2.3.0, 3.0.0
>            Reporter: Ruslan Shestopalyuk
>            Priority: Minor
>              Labels: storage
>             Fix For: 2.3.0
>
>
> DiskBlockManager has a notion of one or more "scratch" local folders, which can be configured via the spark.local.dir option and which default to the system's /tmp. The hierarchy is two levels deep, e.g. /blockmgr-XXX.../YY, where the YY part is a hash-derived subdirectory name used to spread files evenly.
> The DiskBlockManager.getFile function expects the top-level directories (blockmgr-XXX...) to always exist (they are created once, when the Spark context is first created); otherwise it fails with a message like:
> {code}
> ... java.io.IOException: Failed to create local dir in /tmp/blockmgr-XXX.../YY
> {code}
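> Roughly, the relevant logic looks like the following (a simplified sketch, not the exact Spark source; the class shape and hashing details are approximate):
> {code}
> import java.io.{File, IOException}
>
> // Hypothetical, simplified model of DiskBlockManager.getFile (the real
> // implementation also caches sub-dirs and synchronizes access).
> class DiskBlockManagerSketch(localDirs: Array[File], subDirsPerLocalDir: Int) {
>   def getFile(filename: String): File = {
>     // Hash the file name to pick a root dir and a sub-dir under it,
>     // spreading files evenly.
>     val hash = filename.hashCode & Integer.MAX_VALUE
>     val dirId = hash % localDirs.length
>     val subDirId = (hash / localDirs.length) % subDirsPerLocalDir
>     val subDir = new File(localDirs(dirId), "%02x".format(subDirId))
>     // Only the YY sub-dir is created on demand; mkdir() fails when the
>     // parent blockmgr-XXX... directory has been deleted externally.
>     if (!subDir.exists() && !subDir.mkdir()) {
>       throw new IOException(s"Failed to create local dir in $subDir")
>     }
>     new File(subDir, filename)
>   }
> }
> {code}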
> However, this is not always the case.
> In particular, *if the default /tmp folder is used*, the OS may automatically remove files from it according to different strategies:
> * at boot time
> * on a regular basis (e.g. once per day via a system cron job)
> * based on file age
> The symptom is that after the process (in our case, a service) using Spark has been running for a while (a few days), it can no longer load files, because the top-level scratch directories are gone and DiskBlockManager.getFile crashes.
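> One possible defensive fix (a sketch of the general idea only, not necessarily the patch that was merged) is to recreate the whole missing path instead of relying on a single mkdir() of the leaf sub-dir:
> {code}
> import java.io.File
> import java.nio.file.Files
>
> // Sketch: tolerate an externally deleted blockmgr-XXX... root by
> // recreating the whole missing path, not just the leaf sub-dir.
> def ensureSubDir(subDir: File): File = {
>   if (!subDir.exists()) {
>     // createDirectories() recreates any missing parents and does not
>     // fail if another thread created them concurrently.
>     Files.createDirectories(subDir.toPath)
>   }
>   subDir
> }
> {code}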



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org