Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2018/02/12 14:16:00 UTC

[jira] [Commented] (SPARK-23397) Scheduling delay causes Spark Streaming to miss batches.

    [ https://issues.apache.org/jira/browse/SPARK-23397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360789#comment-16360789 ] 

Sean Owen commented on SPARK-23397:
-----------------------------------

This is how it's supposed to work. Batches don't overlap. If one overruns, the rest are delayed.
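
As a rough illustration of that behaviour, here is a toy model in plain Python. It is not Spark code and does not reflect Spark's actual scheduler implementation. It assumes only what the comment states: batches run one at a time, in order, and a batch that overruns pushes back the ones behind it. The function name `simulate_batches` and the per-batch durations are hypothetical; the 20-second interval is taken from the report.

```python
def simulate_batches(batch_interval, durations):
    """Toy model: batches run strictly one after another, never overlapping.

    batch_interval: seconds between scheduled batch times.
    durations: made-up time each batch takes, in seconds.
    Returns one dict per batch with its due time, start, finish and delay.
    """
    timeline = []
    previous_finish = 0
    for index, duration in enumerate(durations):
        due = index * batch_interval        # when the batch is scheduled to run
        start = max(due, previous_finish)   # it must wait for the previous batch
        finish = start + duration
        timeline.append({
            "batch": index,
            "due": due,
            "start": start,
            "finish": finish,
            "delay": start - due,           # scheduling delay for this batch
        })
        previous_finish = finish
    return timeline


if __name__ == "__main__":
    # 20-second interval as in the report; the durations are invented,
    # with the second batch overrunning its interval.
    for row in simulate_batches(20, [10, 50, 10, 10, 10]):
        print(row)
```

In this sketch the batch that takes 50 seconds delays the following batches by 30, 20 and 10 seconds. Every batch still runs, and the delay shrinks once processing time drops back below the interval.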

> Scheduling delay causes Spark Streaming to miss batches.
> --------------------------------------------------------
>
>                 Key: SPARK-23397
>                 URL: https://issues.apache.org/jira/browse/SPARK-23397
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.2.1
>            Reporter: Shahbaz Hussain
>            Priority: Major
>
> * For complex Spark (Scala) DStream-based applications that create many jobs per batch (for example, 40), batches are not created at their scheduled times. For example, if I start a Spark Streaming application with a batch interval of 20 seconds and the application creates about 40 jobs per batch, the next batch is not created 20 seconds after the previous batch's job creation time.
> * This is because job creation is single-threaded: if the job creation delay is greater than the batch interval, batch execution misses its schedule.



