You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 05:35:40 UTC
[jira] [Updated] (SPARK-4545) If first Spark Streaming batch fails,
it waits 10x batch duration before stopping
[ https://issues.apache.org/jira/browse/SPARK-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-4545:
--------------------------------
Labels: bulk-closed (was: )
> If first Spark Streaming batch fails, it waits 10x batch duration before stopping
> ---------------------------------------------------------------------------------
>
> Key: SPARK-4545
> URL: https://issues.apache.org/jira/browse/SPARK-4545
> Project: Spark
> Issue Type: Bug
> Components: DStreams
> Affects Versions: 1.1.0, 1.2.1
> Reporter: Sean Owen
> Priority: Major
> Labels: bulk-closed
>
> (I'd like to track the issue raised at http://mail-archives.apache.org/mod_mbox/spark-dev/201411.mbox/%3CCAMAsSdKY=QCT0YUdrkvbVuqXdFCGp1+6g-=s71Fk8ZR4uaTK7g@mail.gmail.com%3E as a JIRA since I think it's a legitimate issue that I can take a look into, with some help.)
> This bit of {{JobGenerator.stop()}} executes, since the message appears in the logs:
> {code}
> def haveAllBatchesBeenProcessed = {
> lastProcessedBatch != null && lastProcessedBatch.milliseconds == stopTime
> }
> logInfo("Waiting for jobs to be processed and checkpoints to be written")
> while (!hasTimedOut && !haveAllBatchesBeenProcessed) {
> Thread.sleep(pollTime)
> }
> // ... 10x batch duration wait here, before seeing the next line log:
> logInfo("Waited for jobs to be processed and checkpoints to be written")
> {code}
> I think that {{lastProcessedBatch}} is always null since no batch ever
> succeeds. Of course, for all this code knows, the next batch might
> succeed and so is there waiting for it. But it should proceed after
> one more batch completes, even if it failed?
> {{JobGenerator.onBatchCompleted}} is only called for a successful batch.
> Can it be called if it fails too? I think that would fix it.
> Should the condition also not be {{lastProcessedBatch.milliseconds <=
> stopTime}} instead of == ?
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org