Posted to issues@spark.apache.org by "Marcelo Vanzin (Jira)" <ji...@apache.org> on 2019/09/17 00:03:00 UTC

[jira] [Created] (SPARK-29105) SHS may delete driver log file of in progress application

Marcelo Vanzin created SPARK-29105:
--------------------------------------

             Summary: SHS may delete driver log file of in progress application
                 Key: SPARK-29105
                 URL: https://issues.apache.org/jira/browse/SPARK-29105
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.0.0
            Reporter: Marcelo Vanzin


There's an issue with how the SHS (Spark History Server) cleans driver logs that is similar to the one seen with event logs: because the reported file size is not updated while the file is being written, the SHS fails to detect activity and may delete the file while it is still in use.
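
To make the failure mode concrete, here is a minimal, self-contained Java model of a size-based staleness check. It is not the SHS code: the class name LogCleanerModel, its scan method, and the time values are hypothetical and exist only for illustration. The model treats a log as active only when its reported size changes between scans, so a file whose reported size stays fixed while it is being written is eventually flagged for deletion.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a cleaner that infers activity from the reported file size.
// This is an illustration of the problem, not the actual SHS implementation.
class LogCleanerModel {
    // Last observed size of a file and the time at which that size last changed.
    private static final class Seen {
        long size;
        long lastActivityMs;

        Seen(long size, long lastActivityMs) {
            this.size = size;
            this.lastActivityMs = lastActivityMs;
        }
    }

    private final long maxAgeMs;
    private final Map<String, Seen> seen = new HashMap<>();

    LogCleanerModel(long maxAgeMs) {
        this.maxAgeMs = maxAgeMs;
    }

    // Returns true if a scan at time nowMs would consider the file stale and delete it.
    boolean scan(String path, long reportedSize, long nowMs) {
        Seen s = seen.get(path);
        if (s == null) {
            // First time the file is seen: record it and keep it.
            seen.put(path, new Seen(reportedSize, nowMs));
            return false;
        }
        if (s.size != reportedSize) {
            // A size change is the only signal of activity in this model.
            s.size = reportedSize;
            s.lastActivityMs = nowMs;
            return false;
        }
        boolean stale = nowMs - s.lastActivityMs > maxAgeMs;
        if (stale) {
            seen.remove(path);
        }
        return stale;
    }

    public static void main(String[] args) {
        LogCleanerModel cleaner = new LogCleanerModel(1000);
        // The driver keeps writing, but the reported size never changes.
        System.out.println(cleaner.scan("driver.log", 0, 0));    // false
        System.out.println(cleaner.scan("driver.log", 0, 500));  // false
        System.out.println(cleaner.scan("driver.log", 0, 2000)); // true
    }
}
```

In this model the third scan returns true even though the application is still running, which is the behavior described above.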

SPARK-24787 added a workaround in the SHS so that it can detect that situation for in-progress apps, replacing the previous solution, which was too slow for event logs.

That workaround does not work for driver logs, though, because they do not follow the same naming pattern (event logs use a different file name while in progress). Applying it to driver logs would require the SHS to open each driver log file on every scan, which is expensive.

The old approach (using the {{hsync}} API) seems to be a good match for driver logs, though, since writing them does not slow down the listener bus the way writing event logs does.
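
As a rough illustration of why an explicit sync helps, the following self-contained Java model separates the bytes a writer has produced from the length a directory listing would report. The class SyncedLogModel and its methods are hypothetical and do not use Hadoop. On HDFS the corresponding call is, as far as I know, DFSOutputStream.hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH)), but the sketch below only models the effect.

```java
// Hypothetical model of a file whose reported length only advances on an explicit sync.
// It does not call any Hadoop API; hsync() here merely stands in for a sync that
// also updates the length visible to other readers.
class SyncedLogModel {
    private long bytesWritten = 0;   // what the writer has produced so far
    private long reportedLength = 0; // what a directory listing would show

    // Writing alone does not change the reported length in this model.
    void write(int numBytes) {
        bytesWritten += numBytes;
    }

    // Publishes the current length so that a scanner can observe activity.
    void hsync() {
        reportedLength = bytesWritten;
    }

    long reportedLength() {
        return reportedLength;
    }

    public static void main(String[] args) {
        SyncedLogModel log = new SyncedLogModel();
        log.write(128);
        System.out.println(log.reportedLength()); // 0: the listing has not seen the write
        log.hsync();
        System.out.println(log.reportedLength()); // 128: a scan now sees activity
    }
}
```

With the length published periodically, a size-based scan can tell that the file is still active without opening it.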



