You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 05:35:40 UTC

[jira] [Updated] (SPARK-4545) If first Spark Streaming batch fails, it waits 10x batch duration before stopping

     [ https://issues.apache.org/jira/browse/SPARK-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-4545:
--------------------------------
    Labels: bulk-closed  (was: )

> If first Spark Streaming batch fails, it waits 10x batch duration before stopping
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-4545
>                 URL: https://issues.apache.org/jira/browse/SPARK-4545
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 1.1.0, 1.2.1
>            Reporter: Sean Owen
>            Priority: Major
>              Labels: bulk-closed
>
> (I'd like to track the issue raised at http://mail-archives.apache.org/mod_mbox/spark-dev/201411.mbox/%3CCAMAsSdKY=QCT0YUdrkvbVuqXdFCGp1+6g-=s71Fk8ZR4uaTK7g@mail.gmail.com%3E as a JIRA since I think it's a legitimate issue that I can take a look into, with some help.)
> This bit of {{JobGenerator.stop()}} executes, since the message appears in the logs:
> {code}
> def haveAllBatchesBeenProcessed = {
>   lastProcessedBatch != null && lastProcessedBatch.milliseconds == stopTime
> }
> logInfo("Waiting for jobs to be processed and checkpoints to be written")
> while (!hasTimedOut && !haveAllBatchesBeenProcessed) {
>   Thread.sleep(pollTime)
> }
> // ... 10x batch duration wait here, before seeing the next line log:
> logInfo("Waited for jobs to be processed and checkpoints to be written")
> {code}
> I think that {{lastProcessedBatch}} is always null since no batch ever
> succeeds. Of course, for all this code knows, the next batch might
> succeed and so is there waiting for it. But it should proceed after
> one more batch completes, even if it failed?
> {{JobGenerator.onBatchCompleted}} is only called for a successful batch.
> Can it be called if it fails too? I think that would fix it.
> Should the condition also not be {{lastProcessedBatch.milliseconds <=
> stopTime}} instead of == ?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org