Posted to issues@spark.apache.org by "Marcelo Vanzin (Jira)" <ji...@apache.org> on 2019/09/18 16:13:00 UTC

[jira] [Assigned] (SPARK-29105) SHS may delete driver log file of in progress application

     [ https://issues.apache.org/jira/browse/SPARK-29105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin reassigned SPARK-29105:
--------------------------------------

    Assignee: Marcelo Vanzin

> SHS may delete driver log file of in progress application
> ---------------------------------------------------------
>
>                 Key: SPARK-29105
>                 URL: https://issues.apache.org/jira/browse/SPARK-29105
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
>            Priority: Minor
>
> There's an issue with how the SHS cleans driver logs that is similar to the problem with event logs: because the reported file size is not updated while the file is being written, the SHS fails to detect activity and thus may delete the file while it is still being written to.
> SPARK-24787 added a workaround in the SHS so that it can detect that situation for in-progress apps, replacing the previous solution, which was too slow for event logs.
> But that workaround doesn't work for driver logs, because they do not follow the same pattern (different file names for in-progress files); applying it to them would require the SHS to open the driver log files on every scan, which is expensive.
> The old approach (using the {{hsync}} API) seems to be a good match for the driver logs, though, since writing them does not slow down the listener bus the way writing event logs does.
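
To make the failure mode described above concrete, here is a minimal, self-contained Java sketch of a size-based staleness check. It is not the SHS code: the class name, the scan method, and the timing values are all hypothetical and chosen only for illustration. It shows that a file whose reported length never changes looks the same to the cleaner as an abandoned file, while a file whose reported length keeps growing (which is what syncing the length on write would provide) keeps resetting the clock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a cleaner that infers activity from reported file size.
class DriverLogCleanerSketch {
    // What the cleaner remembers about a log file between scans.
    private static final class Tracked {
        final long size;
        final long lastChangedMs;

        Tracked(long size, long lastChangedMs) {
            this.size = size;
            this.lastChangedMs = lastChangedMs;
        }
    }

    private final Map<String, Tracked> seen = new HashMap<>();
    private final long maxAgeMs;

    DriverLogCleanerSketch(long maxAgeMs) {
        this.maxAgeMs = maxAgeMs;
    }

    // Returns true if the file looks stale and would be deleted on this scan.
    boolean scan(String path, long reportedSize, long nowMs) {
        Tracked prev = seen.get(path);
        if (prev == null || prev.size != reportedSize) {
            // First sighting, or the size changed: treat as activity.
            seen.put(path, new Tracked(reportedSize, nowMs));
            return false;
        }
        // Size unchanged: stale once it has been unchanged longer than maxAgeMs.
        boolean stale = nowMs - prev.lastChangedMs > maxAgeMs;
        if (stale) {
            seen.remove(path);
        }
        return stale;
    }

    public static void main(String[] args) {
        DriverLogCleanerSketch cleaner = new DriverLogCleanerSketch(100L);
        long[] times = {0L, 60L, 120L};
        long[] growing = {0L, 10L, 20L};
        for (int i = 0; i < times.length; i++) {
            // "stuck.log" is being written, but its reported length stays at 0.
            boolean stuck = cleaner.scan("stuck.log", 0L, times[i]);
            // "synced.log" reports a growing length on every scan.
            boolean synced = cleaner.scan("synced.log", growing[i], times[i]);
            System.out.println("t=" + times[i] + " stuck deleted=" + stuck
                + " synced deleted=" + synced);
        }
    }
}
```

In this sketch the cleaner only compares metadata and never opens the file, which is why it stays cheap per scan; the cost is that it depends entirely on the writer making its progress visible in the reported length.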



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
