Posted to issues@spark.apache.org by "Burak Yavuz (JIRA)" <ji...@apache.org> on 2017/07/11 02:21:00 UTC

[jira] [Closed] (SPARK-21370) Avoid doing anything on HDFSBackedStateStore.abort() when there are no updates to commit

     [ https://issues.apache.org/jira/browse/SPARK-21370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Burak Yavuz closed SPARK-21370.
-------------------------------
    Resolution: Not A Problem

> Avoid doing anything on HDFSBackedStateStore.abort() when there are no updates to commit
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-21370
>                 URL: https://issues.apache.org/jira/browse/SPARK-21370
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 2.1.1
>            Reporter: Burak Yavuz
>            Assignee: Burak Yavuz
>            Priority: Minor
>
> Currently, the HDFSBackedStateStore sets its state to UPDATING as soon as it is initialized.
> For every trigger, we create two state stores: one used by the "StateStoreRestore" operator only to read data, and one used by the "StateStoreSave" operator to write updates. So the "Restore" StateStore is read-only. This state store gets "aborted" after a task completes, and this abort attempts to delete files even though the store made no updates.
> This can be avoided if there is an INITIALIZED state and abort deletes files only when there is an update to the state store using "put" or "remove".
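
The proposal in the description above can be illustrated with a small, self-contained model. This is only a sketch and not Spark's implementation: the class SketchStateStore, its in-memory dict, and the temp_file_deletes counter are hypothetical stand-ins, and the COMMITTED and ABORTED state names are assumed for illustration. Only INITIALIZED, UPDATING, "put", "remove", and abort come from the description.

```python
from enum import Enum, auto


class State(Enum):
    INITIALIZED = auto()  # proposed new state: store created, nothing written yet
    UPDATING = auto()     # at least one put/remove has happened
    COMMITTED = auto()    # assumed name, included only for completeness
    ABORTED = auto()      # assumed name for the state after abort()


class SketchStateStore:
    """Hypothetical in-memory model of the proposed state transitions."""

    def __init__(self):
        # Start as INITIALIZED instead of UPDATING, as the issue proposes.
        self.state = State.INITIALIZED
        self.pending = {}
        # Stand-in for file system cleanup; counts simulated delete attempts.
        self.temp_file_deletes = 0

    def _start_updating(self):
        if self.state not in (State.INITIALIZED, State.UPDATING):
            raise RuntimeError(f"cannot update a store in state {self.state.name}")
        self.state = State.UPDATING

    def put(self, key, value):
        self._start_updating()
        self.pending[key] = value

    def remove(self, key):
        self._start_updating()
        self.pending.pop(key, None)

    def abort(self):
        # Clean up only if something was actually written.
        if self.state is State.UPDATING:
            self.temp_file_deletes += 1
        self.state = State.ABORTED


# A read-only ("Restore") store never leaves INITIALIZED, so abort is a no-op.
read_only = SketchStateStore()
read_only.abort()

# A writing ("Save") store moves to UPDATING on the first put, so abort cleans up.
writer = SketchStateStore()
writer.put("key", 1)
writer.abort()
```

In this model, the read-only store ends with temp_file_deletes equal to 0 and the writer ends with 1, which is the behavior the description asks for.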


