Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2014/10/21 22:07:33 UTC

[jira] [Commented] (SPARK-4026) Write ahead log to synchronously write received data to HDFS and recover on driver failure

    [ https://issues.apache.org/jira/browse/SPARK-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178967#comment-14178967 ] 

Apache Spark commented on SPARK-4026:
-------------------------------------

User 'tdas' has created a pull request for this issue:
https://github.com/apache/spark/pull/2882

> Write ahead log to synchronously write received data to HDFS and recover on driver failure
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-4026
>                 URL: https://issues.apache.org/jira/browse/SPARK-4026
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Streaming
>            Reporter: Tathagata Das
>            Assignee: Tathagata Das
>            Priority: Critical
>
> As part of the effort to avoid data loss on Spark Streaming driver failure, we want to implement a write ahead log that can write received data to HDFS. This allows the received data to persist across driver failures, so that when the streaming driver is restarted, it can find and reprocess all the data that was received but not processed.
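
The recovery idea in the description can be illustrated with a small sketch. This is not the code from the pull request and does not use any Spark or HDFS API. It assumes a local file as a stand-in for HDFS, and the names WriteAheadLog, write_received, mark_processed, recover_unprocessed and demo are hypothetical, chosen only for illustration. Each received block is appended and synced to disk before the call returns, and a restarted "driver" replays the log to find blocks that were received but never marked as processed.

```python
import json
import os
import tempfile


class WriteAheadLog:
    """Toy write-ahead log backed by a local file (a stand-in for HDFS)."""

    def __init__(self, path):
        self.path = path

    def _append(self, entry):
        # Append one JSON line and force it to disk before returning,
        # so the write is synchronous from the caller's point of view.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def write_received(self, block_id, data):
        # Record a received block before it is handed off for processing.
        self._append({"type": "received", "id": block_id, "data": data})

    def mark_processed(self, block_id):
        # Record that a previously received block has been processed.
        self._append({"type": "processed", "id": block_id})

    def recover_unprocessed(self):
        """Replay the log and return {block_id: data} for blocks that
        were received but never marked as processed."""
        pending = {}
        if not os.path.exists(self.path):
            return pending
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                entry = json.loads(line)
                if entry["type"] == "received":
                    pending[entry["id"]] = entry["data"]
                else:
                    pending.pop(entry["id"], None)
        return pending


def demo():
    # Hypothetical log location; a real deployment would use a
    # fault-tolerant file system rather than a temp directory.
    log_path = os.path.join(tempfile.mkdtemp(), "received.wal")

    wal = WriteAheadLog(log_path)
    wal.write_received("block-1", [1, 2, 3])
    wal.write_received("block-2", [4, 5])
    wal.mark_processed("block-1")

    # Simulate a driver restart: a fresh object reads the same log file.
    restarted = WriteAheadLog(log_path)
    return restarted.recover_unprocessed()
```

Calling demo() leaves only block-2 pending after the simulated restart, since block-1 was marked as processed before the "failure". The sketch keeps everything in a single log file for brevity; log rotation and cleanup of old entries are left out.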



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
