Posted to issues@spark.apache.org by "SaintBacchus (JIRA)" <ji...@apache.org> on 2015/06/26 03:25:05 UTC

[jira] [Closed] (SPARK-8163) CheckPoint mechanism did not work well when error happened in big streaming

     [ https://issues.apache.org/jira/browse/SPARK-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

SaintBacchus closed SPARK-8163.
-------------------------------
    Resolution: Not A Problem

I misunderstood the problem.

> CheckPoint mechanism did not work well when error happened in big streaming
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-8163
>                 URL: https://issues.apache.org/jira/browse/SPARK-8163
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.4.0
>            Reporter: SaintBacchus
>
> I tested this with a Kafka DStream.
> Sometimes the Kafka producer had pushed a large amount of data to the Kafka brokers, and the Streaming receiver then tried to pull this data without any rate limit.
> In this first batch, Streaming could take 10 or more seconds to consume the data (the batch interval was 2 seconds).
> Here is what Streaming was doing at this moment, in more detail:
> The SC was doing its job; the JobGenerator was still sending new batches to the StreamingContext, and the StreamingContext wrote them to the checkpoint files. Meanwhile, the Receiver was still busy receiving data from Kafka and also recorded these events in the checkpoint.
> Then an unexpected error occurred, shutting down the Streaming application.
> Then we wanted to recover the application from the checkpoint files. But since the StreamingContext had already recorded the next few batches, it would be recovered from the last batch. So Streaming had already missed the first batch and did not know what data had actually been consumed by the Receiver.
> Setting spark.streaming.concurrentJobs=2 could avoid this problem, but some applications cannot do this.
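
The timing in the description above can be made concrete with a small sketch. It only models the arithmetic of the report (a 2-second batch interval and a first batch that takes 10 seconds); it does not call Spark, and it says nothing about how Spark actually recovers from a checkpoint, which the reporter later concluded was not a problem. The function name `pending_batches` and the assumption that one batch job runs at a time are illustrative, not taken from Spark.

```python
def pending_batches(batch_interval_s, first_batch_duration_s):
    """Return the times at which new batches are generated while the
    first batch (started at time 0) is still being processed.

    Assumption for illustration: a new batch is generated every
    batch_interval_s seconds and only one batch job runs at a time.
    """
    pending = []
    t = batch_interval_s
    # A batch generated exactly when the first one finishes is not counted.
    while t < first_batch_duration_s:
        pending.append(t)
        t += batch_interval_s
    return pending


# Values from the report: 2-second batches, first batch takes 10 seconds.
queued = pending_batches(batch_interval_s=2, first_batch_duration_s=10)
print(len(queued), queued)  # 4 [2, 4, 6, 8]
```

Under these assumptions, four further batches have been generated, and per the description checkpointed, before the first batch finishes. That backlog is the situation the report is concerned with.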



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)