Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2018/02/15 17:02:00 UTC
[jira] [Assigned] (SPARK-23438) DStreams could lose blocks with WAL enabled when driver crashes
[ https://issues.apache.org/jira/browse/SPARK-23438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-23438:
------------------------------------
Assignee: Apache Spark
> DStreams could lose blocks with WAL enabled when driver crashes
> ---------------------------------------------------------------
>
> Key: SPARK-23438
> URL: https://issues.apache.org/jira/browse/SPARK-23438
> Project: Spark
> Issue Type: Bug
> Components: DStreams
> Affects Versions: 1.6.0
> Reporter: Gabor Somogyi
> Assignee: Apache Spark
> Priority: Critical
>
> There is a race condition introduced in SPARK-11141 which could cause data loss.
> This affects all versions since 1.6.0.
> Problematic situation:
> # Start streaming job with 2 receivers with WAL enabled.
> # Receiver 1 receives a block and does the following
> ** Writes a BlockAdditionEvent into WAL
> ** Puts the block into its received block queue with ID 1
> # Receiver 2 receives a block and does the following
> ** Writes a BlockAdditionEvent into WAL
> # Spark allocates all blocks from its received block queue (only block 1, since Receiver 2 has not yet enqueued its block) and writes AllocatedBlocks(IDs=(1)) into WAL
> # Driver crashes
> # New Driver recovers from WAL
> # The block with ID 2 is never processed
>
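The sequence above can be replayed in a small simulation. This is only a sketch, not Spark code: the names `recover` and `clear_on_allocation` are hypothetical, and the ticket does not say how recovery drops the block. The sketch assumes that replaying an allocation record empties the whole rebuilt queue, and compares that with removing only the block IDs named in the record.

```python
from collections import deque

# WAL contents at the moment the driver crashes, in write order.
# Block 2's addition was logged, but the block was not yet in the
# in-memory queue when the allocation (IDs=(1)) happened.
wal = [
    ("add", [1]),       # Receiver 1: BlockAdditionEvent
    ("add", [2]),       # Receiver 2: BlockAdditionEvent
    ("allocate", [1]),  # AllocatedBlocks(IDs=(1))
]

def recover(records, clear_on_allocation):
    """Replay WAL records; return (allocated IDs, still-unallocated IDs).

    clear_on_allocation=True models the ASSUMED faulty behaviour:
    an allocation record wipes the entire rebuilt queue.
    clear_on_allocation=False removes only the IDs the record names.
    """
    queue = deque()
    allocated = []
    for kind, ids in records:
        if kind == "add":
            queue.extend(ids)
        elif kind == "allocate":
            allocated.extend(ids)
            if clear_on_allocation:
                queue.clear()
            else:
                for block_id in ids:
                    if block_id in queue:
                        queue.remove(block_id)
    return allocated, list(queue)

lossy = recover(wal, clear_on_allocation=True)
safe = recover(wal, clear_on_allocation=False)
print("clear whole queue:", lossy)
print("remove named IDs: ", safe)
```

Under the first policy block 2 appears in neither list after recovery, which matches the data loss described in the ticket; under the second it stays queued for a later batch.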