Posted to issues@spark.apache.org by "Tathagata Das (JIRA)" <ji...@apache.org> on 2015/08/05 00:30:05 UTC
[jira] [Created] (SPARK-9619) Restarting the receiver's BlockGenerator does not clear previous data
Tathagata Das created SPARK-9619:
------------------------------------
Summary: Restarting the receiver's BlockGenerator does not clear previous data
Key: SPARK-9619
URL: https://issues.apache.org/jira/browse/SPARK-9619
Project: Spark
Issue Type: Bug
Components: Streaming
Reporter: Tathagata Das
Assignee: Tathagata Das
Priority: Minor
The internal default block generator used by receivers is reused across receiver restarts. Data left in its buffer from before a restart can therefore lead to duplicate data. This is tolerable, since receivers provide an at-least-once guarantee at best. Furthermore, reliable receivers such as ReliableKafkaReceiver do not reuse BlockGenerator objects and hence do not have this problem.
The solution is to ensure that the internal buffer of the BlockGenerator is cleared every time it is started.
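
The effect of the proposed fix can be illustrated with a small toy model. This is only a sketch, not Spark's BlockGenerator: the class ToyBlockGenerator, its methods, and the helper run_with_restart are hypothetical names made up for illustration, and the scenario assumes that the source replays unacknowledged records after the receiver restarts.

```python
class ToyBlockGenerator:
    """Toy stand-in for a receiver's block generator (not Spark's actual class)."""

    def __init__(self, clear_on_start):
        self.clear_on_start = clear_on_start
        self.buffer = []       # records not yet cut into a block
        self.running = False

    def start(self):
        if self.clear_on_start:
            # Proposed fix: drop anything left over from the previous run.
            self.buffer = []
        self.running = True

    def add(self, record):
        if not self.running:
            raise RuntimeError("generator is not running")
        self.buffer.append(record)

    def stop(self):
        # Models a stop that leaves buffered records in place.
        self.running = False

    def cut_block(self):
        # Hand over the buffered records as one block and reset the buffer.
        block, self.buffer = self.buffer, []
        return block


def run_with_restart(clear_on_start):
    """Stop and restart the same generator object, then cut one block."""
    gen = ToyBlockGenerator(clear_on_start)
    gen.start()
    gen.add("a")
    gen.add("b")
    gen.stop()     # receiver stops before the block is cut
    gen.start()    # receiver restarts and reuses the same generator
    # Assumption: the source replays "a" and "b", then sends "c".
    for record in ("a", "b", "c"):
        gen.add(record)
    return gen.cut_block()


if __name__ == "__main__":
    print(run_with_restart(clear_on_start=False))
    print(run_with_restart(clear_on_start=True))
```

In this model, reusing the generator without clearing keeps the stale "a" and "b" in the buffer next to the replayed copies, while clearing on start leaves only the records received after the restart.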