Posted to issues@spark.apache.org by "Feng Gui (JIRA)" <ji...@apache.org> on 2017/03/01 16:46:45 UTC

[jira] [Commented] (SPARK-19779) structured streaming exist needless tmp file

    [ https://issues.apache.org/jira/browse/SPARK-19779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890528#comment-15890528 ] 

Feng Gui commented on SPARK-19779:
----------------------------------

[~srowen] The `Background maintenance` task doesn't clean up files whose names start with `temp`, so I think the temp file is never deleted. However, the leftover temp file doesn't cause the Structured Streaming job to produce incorrect results.
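
To illustrate the kind of cleanup being discussed, here is a minimal sketch in Python. It is not Spark's code and does not talk to HDFS: it works on a local directory, and the function name `remove_stale_temp_files`, the `state_dir` parameter, and the default `temp` prefix are assumptions made for illustration, based only on the observation that the leftover files start with `temp`.

```python
from pathlib import Path
from typing import List


def remove_stale_temp_files(state_dir: Path, prefix: str = "temp") -> List[str]:
    """Delete regular files directly under state_dir whose names start with prefix.

    Hypothetical helper: the directory layout and the prefix are assumptions,
    not Spark's actual implementation.
    """
    removed = []
    for path in sorted(state_dir.iterdir()):
        # Skip subdirectories and anything that is not a leftover temp file.
        if path.is_file() and path.name.startswith(prefix):
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling `remove_stale_temp_files(Path("/some/state/dir"))` returns the names of the files it deleted and leaves every other file in place. A cleanup like this is only safe when no writer is still using a temp file, for example right after the job restarts and before it processes new data.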

> structured streaming exist needless tmp file 
> ---------------------------------------------
>
>                 Key: SPARK-19779
>                 URL: https://issues.apache.org/jira/browse/SPARK-19779
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.1.0
>            Reporter: Feng Gui
>            Priority: Minor
>
> The PR (https://github.com/apache/spark/pull/17012) fixes restarting a Structured Streaming application that uses HDFS as its file system, but one problem remains: a tmp file for the delta file is left behind in HDFS. Structured Streaming never deletes the tmp file generated when the streaming job restarts, so we need to delete the tmp file after the streaming job restarts.


